ABSTRACT

In this paper we study a key phase in the formation of massive galaxies: the transition of star forming galaxies
into massive (M_stars ~ 10^11 M_⊙), compact (r_e ~ 1 kpc) quiescent galaxies, which takes place from z ~ 3 to
z ~ 1.5. We use HST grism redshifts and extensive photometry in all five 3D-HST/CANDELS fields, more
than doubling the area used previously for such studies, and combine these data with Keck MOSFIRE and
NIRSPEC spectroscopy. We first confirm that a population of massive, compact, star forming galaxies exists
at z > 2, using K-band spectroscopy of 25 of these objects at 2.0 < z < 2.5. They have a median [N II]/Hα
ratio of 0.6, are highly obscured with SFR(tot)/SFR(Hα) ~ 10, and have a large range of observed line widths.
We infer from the kinematics and spatial distribution of Hα that the galaxies have rotating disks of ionized gas
that are a factor of ~2 more extended than the stellar distribution. By combining measurements of individual
galaxies, we find that the kinematics are consistent with a nearly Keplerian fall-off from v_rot ~ 500 km s^-1 at
1 kpc to v_rot ~ 250 km s^-1 at 7 kpc, and that the total mass out to this radius is dominated by the dense stellar
component. Next, we study the size and mass evolution of the progenitors of compact massive galaxies. Even 
though individual galaxies may have had complex histories with periods of compaction and mergers, we show 
that the population of progenitors likely followed a simple inside-out growth track in the size-mass plane of 
Δ log r_e ~ 0.3 Δ log M_stars. This mode of growth gradually increases the stellar mass within a fixed physical
radius, and galaxies quench when they reach a stellar density or velocity dispersion threshold. As shown in 
other studies, the mode of growth changes after quenching, as dry mergers take the galaxies on a relatively 
steep track in the size-mass plane. 

Keywords: galaxies: evolution — galaxies: structure 


1. INTRODUCTION

Many studies have shown that massive galaxies with low
star formation rates were remarkably compact at z > 2 (e.g.,
Daddi et al. 2005; Trujillo et al. 2006; van Dokkum et al.
2008; Damjanov et al. 2011; Conselice 2014). At fixed stellar
mass of M_stars ~ 10^11 M_⊙, quiescent galaxies are a factor
of ~4 smaller at z = 2 than at z = 0 (e.g., van der Wel
et al. 2014b). As the stellar mass of the galaxies also evolves,
the inferred size growth of individual galaxies is even larger
(van Dokkum et al. 2010; Patel et al. 2013). It is unlikely
that all massive galaxies in the present-day Universe had a
compact progenitor (van Dokkum et al. 2008, 2014; Franx
et al. 2008; Newman et al. 2012; Poggianti et al. 2013; Belli,
Newman, & Ellis 2014a); however, the vast majority of compact,
massive galaxies that are observed at z = 2 ended up in
the center of a much larger galaxy today (Belli et al. 2014a;
van Dokkum et al. 2014). Their size growth after z = 2 is
probably dominated by minor mergers: such mergers are expected,
and other mechanisms cannot easily produce the observed
r_e ∝ M_stars^2 scaling between size growth and mass
growth (Bezanson et al. 2009; Naab, Johansson, & Ostriker
2009; Hopkins et al. 2010; Trujillo et al. 2011; Hilz, Naab, &
Ostriker 2013).

It is not yet clear how these massive, extremely compact
galaxies were formed, and this question has significance well
beyond the somewhat narrow context of the size evolution of
quiescent galaxies. The dense centers of massive galaxies today
are home to the most massive black holes in the Universe
(Magorrian et al. 1998); have an enrichment history that is
very different from that of the Milky Way (Worthey, Faber,
& Gonzalez 1992); and probably had a bottom-heavy stellar
initial mass function (IMF) (Conroy & van Dokkum 2012).
All these characteristics are the product of processes that took
place in the star forming progenitors of z ~ 2 massive quiescent
galaxies. Furthermore, stars in very dense regions represent
only a very small fraction (~0.1%) of the stellar mass
in the Universe today, but their contribution rises sharply with
redshift: depending on the IMF, stars inside dense cores with
M(r < 1 kpc) > 3 × 10^10 M_⊙ may contribute 10% - 20% of the
stellar mass density at z > 2 (van Dokkum et al. 2014).

The formation of compact massive galaxies requires large
amounts of gas to be funneled into a region that is only 1-2
kpc in diameter, while preventing significant star formation at
larger radii. Galaxy formation models have been able to reproduce
the broad characteristics of compact massive galaxies,
either by mergers that are accompanied by a strong central
starburst (e.g., Hopkins et al. 2009b; Wuyts et al. 2010;
Wellons et al. 2015), by in-situ formation from highly efficient
gas cooling (Naab et al. 2009; Wellons et al. 2015),
or by contraction (“compaction”) of star forming gas disks
(Dekel & Burkert 2014; Zolotov et al. 2015). These scenarios
have testable predictions: for example, if compact massive
galaxies formed in mergers then they may be expected
to show tidal features. Furthermore, the star formation rates
of galaxies, and their evolution in the size-mass plane, can be
compared to observations.

Observationally, the challenge is to identify these star forming
progenitors of compact massive galaxies. Once they are
found they can be studied, to measure the physical conditions
inside them and to test proposed mechanisms for their formation
(see Barro et al. 2013, 2014b; Nelson et al. 2014;
Williams et al. 2014, 2015, for examples of such studies).
The main observational complication is that typical quiescent
galaxies at z > 2 are structurally very different from typical
star forming galaxies (see, e.g., Franx et al. 2008). At fixed
mass, star forming galaxies are larger, have a lower Sersic
(1968) index and, as a result, a much lower central density
(e.g., Franx et al. 2008; Kriek et al. 2009a; van der Wel et al.
2014b; van Dokkum et al. 2014). It may be that a subset of
the star forming galaxies decrease their size through mergers
or “compaction”, but it would be difficult to pinpoint which
among the many large, star forming galaxies are destined to
go through these phases. A similar problem arises when linking
compact, quiescent descendants at z = 2 to (lower mass)
star forming galaxies at much higher redshift (Williams et al.
2014, 2015): although there may be progenitors of massive
quiescent galaxies among small, blue, low mass star forming
galaxies at z > 3, most of those galaxies will likely follow
different paths.

Barro et al. (2013, 2014b) and Nelson et al. (2014) use
a relatively model-independent and straightforward way to
identify plausible progenitors: they select massive star forming
galaxies at z > 2 with the same small sizes as quiescent
galaxies. These objects form the compact tail of the size
distribution of star forming galaxies: for every massive star
forming galaxy at z = 2-2.5 that is compact, there are several
that are not (see Sect. 2.3 and van der Wel et al. 2014b).
It seems plausible that star forming galaxies with the same
structure as quiescent galaxies are the direct ancestors of these
galaxies, and there may be physical reasons why the most
compact star forming galaxies are the most likely to shut off:
many proposed quenching and maintenance mechanisms operate
most effectively when a significant bulge (and associated
black hole) has formed (Croton et al. 2006; Hopkins
et al. 2008; Johansson, Naab, & Ostriker 2009; Conroy, van
Dokkum, & Kravtsov 2015).

In this paper we build on previous studies by identifying
a sample of massive, compact, star forming galaxies at
z = 2-2.5 in the 3D-HST survey (van Dokkum et al. 2011;
Brammer et al. 2012b; Skelton et al. 2014). We study all
five 3D-HST/CANDELS fields in a homogeneous way, providing
improved measurements of the number density of candidate
compact galaxies in formation. We present extensive
Keck spectroscopy of a subset of these candidates, and measure
redshifts, emission line widths, and emission line ratios.
The Hα line profile and spatial extent are used to probe the
potential beyond the stellar effective radius, allowing us to reconstruct
the average rotation curve of this class of objects.
In the second part of the paper we discuss a framework for
the formation and evolution of massive galaxies that places
the results of the Keck spectroscopy in context. We show
that, even though individual galaxies likely have complex formation
histories, the evolution of the population of massive
galaxies can be described with a simple model in which galaxies
follow parallel tracks in the size-mass plane. For consistency
with previous studies we assume Ω_m = 0.3, Ω_Λ = 0.7,
and H_0 = 70 km s^-1 Mpc^-1.

2. COMPACT MASSIVE STAR FORMING GALAXIES 
2.1. Catalogs and Derived Parameters 

We use data from the 3D-HST project (van Dokkum et al.
2011; Brammer et al. 2012b) to identify candidate compact
massive galaxies. The 3D-HST catalogs (Skelton et al.
2014) provide multi-band photometry for objects in the five
extra-galactic fields of the CANDELS survey (Grogin et al.
2011; Koekemoer et al. 2011). Objects were selected using
a signal-to-noise (S/N) optimized combination of the WFC3
J_125, JH_140, and H_160 images. The catalogs encompass nearly
all publicly available data in the CANDELS fields, including
deep IRAC data, as well as medium-band imaging in the optical
and the near-IR. Stars were excluded, as well as objects
that have use_phot=0 (see Skelton et al. 2014).

The imaging data are combined with 3D-HST WFC3 G141
grism spectroscopy, which - together with data from program
GO-11600 - covers ~80% of the CANDELS photometric
area (see Brammer et al. 2012b). The analysis of the combined
photometric and spectroscopic dataset will be described
in detail in I. Momcheva et al., in preparation. Briefly, the
photometric data from Skelton et al. (2014) and the 2D grism
data were fit simultaneously with a modified version of the
EAZY code (Brammer, van Dokkum, & Coppi 2008) to measure
redshifts, rest-frame colors, and the strengths of emission
lines (Brammer et al. 2012a). If there are no significant
emission or absorption features in the grism spectrum, or if
no grism spectrum is available, the fit is similar to a standard
photometric redshift analysis. In version 4.1.4 of our data release,
spectra are extracted only to H_160 < 24 (and only
in the area covered by the grism observations).

In addition to the Skelton et al. photometric information and
the grism spectroscopy we use Spitzer MIPS 24 μm data to
estimate total IR luminosities and star formation rates, as described
in Whitaker et al. (2012, 2014). These IR luminosities
and star formation rates are consistent (within a factor of ~2)
with those derived from the full mid- and far-IR SEDs, at least
for the IR-luminous galaxies that have reliable far-IR photometry
(see, e.g., Muzzin et al. 2010; Elbaz et al. 2011; Wuyts
et al. 2011; Utomo et al. 2014).

Structural parameters of galaxies in the Skelton et al. catalogs
were measured by van der Wel et al. (2014b), using
the methodology described in van der Wel et al. (2012).
Sizes, total luminosities, and ellipticities were measured from
the WFC3 imaging using the GALAPAGOS implementation
(Barden et al. 2012) of GALFIT (Peng et al. 2002). In Sect.
7.2 we show with a stacking analysis that the structural parameters
in the van der Wel et al. (2014b) catalogs are reliable
for the compact, massive galaxies studied in this paper.
The catalog contains a small number of “catastrophic” failures.
To identify these, we compared the total galaxy fluxes
from the GALFIT fit to the total fluxes in the Skelton et al.
catalogs. Galaxies were excluded from the analysis if the absolute
difference between these two measurements exceeds
0.5 magnitudes. In this paper we use circularized half-light
radii throughout, defined as

    \log r_e = \log r_{e,a} + 0.5 \log q,    (1)

with r_{e,a} the half-light radius along the major axis and q = b/a
the axis ratio of the galaxy. The sizes are determined from
data in the H_160 band, which corresponds to rest-frame g at
z = 2.3.
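As a minimal sketch of Eq. 1, the circularized radius is simply the major-axis radius scaled by the square root of the axis ratio. The function name and the example values below are hypothetical and are not taken from the catalogs.

```python
import math


def circularized_radius(r_major_kpc, axis_ratio):
    """Eq. 1: log r_e = log r_{e,a} + 0.5 log q, i.e. r_e = r_{e,a} * sqrt(q)."""
    if r_major_kpc <= 0:
        raise ValueError("major-axis radius must be positive")
    if not 0 < axis_ratio <= 1:
        raise ValueError("axis ratio q = b/a must lie in (0, 1]")
    return r_major_kpc * math.sqrt(axis_ratio)


# Hypothetical galaxy: 2 kpc along the major axis, q = 0.25.
r_e = circularized_radius(2.0, 0.25)
print(f"circularized r_e = {r_e:.2f} kpc")
```

Because q <= 1, circularized radii are never larger than major-axis radii, which is why a size-mass distribution built from them is displaced toward smaller sizes.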

Finally, stellar masses were determined from fits of stellar
population synthesis models to the 0.3 μm - 8 μm photometry,
as described in Skelton et al. (2014). The fits were done
with the FAST code (Kriek et al. 2009b), using a Chabrier
(2003) IMF, the Calzetti et al. (2000) attenuation law, and
exponentially-declining star formation histories. These parameters
were chosen for consistency with previous studies;
small changes such as using “delayed τ” models do not
change the masses significantly. In this paper we do not use
the best-fitting star formation rates, ages, or extinction from
these fits, as they tend to be less robust than the stellar masses
(see, e.g., Kriek et al. 2009b; Muzzin et al. 2009a). A small
(typically ~5%) correction was applied to each galaxy to
make its half-light radius and stellar mass self-consistent:

    \log M_{\rm stars} = \log M_{\rm stars,FAST} + \log(L_G / L_{\rm tot}),    (2)

with L_G the total H band luminosity as implied by the GALFIT
fit and L_tot the total H band luminosity in the Skelton et
al. catalog (see Taylor et al. 2010a; van Dokkum et al. 2014).
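The correction in Eq. 2 can be sketched as follows. This is an illustration only: the function name is hypothetical, and the 5% luminosity difference is chosen to match the typical size of the correction quoted in the text.

```python
import math


def corrected_log_mass(log_mass_fast, lum_galfit, lum_catalog):
    """Eq. 2: rescale the FAST stellar mass by the ratio of the GALFIT
    total luminosity (L_G) to the catalog total luminosity (L_tot)."""
    if lum_galfit <= 0 or lum_catalog <= 0:
        raise ValueError("luminosities must be positive")
    return log_mass_fast + math.log10(lum_galfit / lum_catalog)


# Hypothetical galaxy whose GALFIT luminosity is 5% above the catalog value.
log_m = corrected_log_mass(11.0, 1.05, 1.0)
print(f"corrected log M_stars = {log_m:.3f}")
```

A 5% change in luminosity corresponds to about 0.02 dex in mass, so the correction matters for self-consistency rather than for the sample selection.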

2.2. Selection of Star Forming Galaxies 

In this paper we use the rest-frame colors of galaxies to separate
(candidate) star forming galaxies from quiescent galaxies.
As shown by Labbe et al. (2005), Wuyts et al. (2007),
Whitaker et al. (2011), and many others, galaxies occupy distinct
regions in the space spanned by the rest-frame U-V
and V-J colors, depending on their specific star formation
rate. The reason is that dust and age have a subtly different
effect on the spectral energy distributions (SEDs) of galaxies:
galaxies that are young and dusty are red in both U-V and
V-J, whereas galaxies that are old and dust-free are red in
U-V but (relatively) blue in V-J. With high quality redshifts
and photometry it has been demonstrated that there is
a gap between the (age-)sequence of quiescent galaxies and
the (dust-)sequence of star forming galaxies in the UVJ plane
(Whitaker et al. 2011; Brammer et al. 2011), leading to a
relatively unambiguous separation of the two galaxy classes.

The distribution of galaxies with log M_stars > 10.6 and 2.0 <
z < 2.5 in the UVJ plane is shown in Fig. 1. The quiescent
box is indicated with the black lines; galaxies inside this box
satisfy the equations

    V - J < 1.5,
    U - V > 1.3,
    U - V > 0.8 (V - J) + 0.7.    (3)

Galaxies are color-coded by their specific star formation rates,
defined as SSFR = SFR/M_stars, with SFR the star formation
rate derived from their UV+IR emission (see Whitaker et al.
2014, and references therein). As can be seen in Fig. 1 the
UVJ selection corresponds very well to a selection on specific
star formation rate. This was expected from previous studies
(e.g., Wuyts et al. 2011); nevertheless, the correspondence is
striking as the MIPS 24 μm measurements (which dominate
the star formation rates in this stellar mass range) are entirely
independent from the U-V and V-J colors.
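The quiescent box can be written as a short test on the two rest-frame colors. The sketch below is hypothetical: the function name and example colors are illustrative, and the diagonal intercept of 0.7 is an assumption (the value commonly used for this box in the literature), exposed as a parameter so that a different boundary can be substituted.

```python
def is_quiescent(u_minus_v, v_minus_j, slope=0.8, intercept=0.7):
    """Return True if a galaxy falls inside the UVJ quiescent box:
    V-J < 1.5, U-V > 1.3, and U-V above the diagonal boundary.
    The intercept default is an assumption, not a value confirmed here."""
    return (
        v_minus_j < 1.5
        and u_minus_v > 1.3
        and u_minus_v > slope * v_minus_j + intercept
    )


# Illustrative colors (not measurements from the catalogs):
examples = {
    "old, dust-free": (1.8, 1.0),   # red in U-V, relatively blue in V-J
    "young, dusty": (1.8, 1.8),     # red in both colors
    "young, blue": (0.8, 0.5),      # blue in both colors
}
for label, (uv, vj) in examples.items():
    kind = "quiescent" if is_quiescent(uv, vj) else "star forming"
    print(f"{label}: {kind}")
```

In the convention of this paper, every object for which the test returns False is treated as a star forming galaxy.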



Figure 1. Distribution of galaxies with log(M_stars/M_⊙) > 10.6 and
2.0 < z < 2.5 in the UVJ plane. The galaxies are color-coded by
the logarithm of their specific star formation rate, SSFR = SFR/M_stars.
The star formation rates are derived from the UV+IR emission, with
the IR emission determined from the Spitzer/MIPS flux. In this paper
“star forming galaxies” refers to all objects outside of the UVJ
quiescent box.

We note that a subset of quiescent galaxies has high SSFRs
in Fig. 1; these are galaxies whose rest-frame optical/near-IR
SEDs show no signs of star formation even though they have
high MIPS 24 μm fluxes. These galaxies are difficult to interpret:
they may be quiescent galaxies with an active nucleus, or
their star formation is so obscured that the young stars do not
contribute significantly to the SED. Fumagalli et al. (2014)
show that the optical/near-IR SEDs of these galaxies are very
similar to the ones that have no MIPS detection. Approximately
20% of galaxies in the Barro et al. (2013) sample fall
in this category.

Of 582 galaxies with log M_stars > 10.6 and 2.0 < z < 2.5,
185 (32%) are quiescent and 397 (68%) are star forming. The
total area of the five fields is 896 arcmin^2, and the number densities
of massive quiescent galaxies and massive star forming
galaxies are 1.2 × 10^-4 Mpc^-3 and 2.7 × 10^-4 Mpc^-3, respectively.
These numbers are consistent with previous measurements
from other datasets (e.g., Marchesini et al. 2009; Brammer
et al. 2011; Muzzin et al. 2013).
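The conversion from galaxy counts to number densities can be sketched as below, assuming the flat cosmology adopted in this paper (Ω_m = 0.3, Ω_Λ = 0.7, H_0 = 70 km s^-1 Mpc^-1) and the full 896 arcmin^2 area with no completeness corrections. The function names are hypothetical, and the simple Simpson integration stands in for whatever cosmology code was actually used.

```python
import math

C_KM_S = 299792.458          # speed of light in km/s
H0 = 70.0                    # km/s/Mpc
OMEGA_M, OMEGA_L = 0.3, 0.7  # flat LambdaCDM
FULL_SKY_ARCMIN2 = 4 * math.pi * (180 / math.pi) ** 2 * 3600


def comoving_distance_mpc(z, steps=1000):
    """Line-of-sight comoving distance via Simpson's rule (steps must be even)."""
    def inv_e(zp):
        return 1.0 / math.sqrt(OMEGA_M * (1 + zp) ** 3 + OMEGA_L)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return C_KM_S / H0 * total * h / 3


def survey_volume_mpc3(area_arcmin2, z_lo, z_hi):
    """Comoving volume of a redshift shell, scaled to the survey area."""
    shell = 4 * math.pi / 3 * (
        comoving_distance_mpc(z_hi) ** 3 - comoving_distance_mpc(z_lo) ** 3
    )
    return shell * area_arcmin2 / FULL_SKY_ARCMIN2


volume = survey_volume_mpc3(896.0, 2.0, 2.5)
n_quiescent = 185 / volume
n_star_forming = 397 / volume
print(f"volume = {volume:.3e} Mpc^3")
print(f"n(quiescent) = {n_quiescent:.2e} Mpc^-3")
print(f"n(star forming) = {n_star_forming:.2e} Mpc^-3")
```

Under these assumptions the counts of 185 and 397 galaxies give densities close to the quoted values.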

2.3. Selection of Compact Massive Star Forming Galaxies 

The size-mass relation for galaxies in the 3D-HST survey
with 2.0 < z < 2.5 is shown in Fig. 2. Quiescent and star
forming galaxies, identified using Eq. 3, are indicated with red
and blue points respectively. As is well known, star forming
galaxies are larger than quiescent galaxies at fixed mass (e.g.,
Franx et al. 2008; Williams et al. 2010; van der Wel et al.
2014b). Note that the galaxy distribution in Fig. 2 is displaced
with respect to that in Fig. 5 of van der Wel et al. (2014b), as
we use circularized half-light radii and van der Wel et al. use
half-light radii along the major axis.

Figure 2. Size-mass relation for galaxies with 2.0 < z < 2.5. Sizes
are circularized half-light radii. Red symbols are UVJ-selected quiescent
galaxies, blue symbols are star forming galaxies. The solid
lines show our selection criteria for compact, massive galaxies:
log M_stars > 10.6 and log r_e < log M_stars - 10.7. This criterion is more
restrictive than that used by Barro et al. (2013, 2014b) (dashed line);
we did not use the Barro et al. criterion as 60% of star forming galaxies
with log M_stars > 10.8 fall below the dashed line, and their median
size is significantly larger than that of massive quiescent galaxies.

Compact massive galaxies (CMGs) are in the lower right
portion of the size-mass diagram. Barro et al. (2013) use
the criterion log r_e < (log M_stars - 10.3)/1.5 to isolate compact
galaxies (dashed line in Fig. 2). However, at masses of
~10^11 M_⊙ this selection does not produce a sample of compact
star forming galaxies that is directly comparable to compact
quiescent galaxies. The median size of quiescent galaxies
with log M_stars > 10.8 that satisfy the Barro et al. compactness
criterion is r_e = 1.3 kpc. The median size of star forming
galaxies with log M_stars > 10.8 that satisfy this criterion is
2.2 kpc. For comparison, the median size of the full sample
of star forming galaxies with log M_stars > 10.8 is 2.8 kpc. That
is, at high masses, the Barro et al. criterion selects star forming
galaxies whose sizes are closer to those of the full sample
of star forming galaxies than to those of compact quiescent
galaxies. The reason is that the Barro et al. “compactness”
criterion is not very restrictive at high masses, as it selects
60% of all star forming galaxies that have log M_stars > 10.8.

As our goal is to select plausible progenitors of massive,
compact quiescent galaxies we adopt a slightly more restrictive
criterion:

    \log r_e < \log M_{\rm stars} - 10.7,    (4)

with M_stars in units of M_⊙ and r_e in units of kpc. This limit
is indicated by the solid diagonal line in Fig. 2. Thirty-nine
percent of star forming galaxies with log M_stars > 10.8 satisfy
this criterion and their median size is r_e = 1.8 kpc. As we
discuss below, the slope of unity of our compactness criterion
can be readily interpreted in terms of a physical parameter,
namely the velocity dispersion. The slope of 1/1.5 = 0.67
used by Barro et al. (2013) was chosen to be consistent with
the slope of the size-mass relation of quiescent galaxies as
found by Newman et al. (2012). We note that van der Wel et
al. (2014b) find a slightly steeper slope than Newman et al.
(2012) at z ~ 2.3 (0.76 ± 0.04 versus 0.69 ± 0.17).
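The difference between the two compactness criteria is easiest to see by evaluating the maximum allowed radius at a given mass. The sketch below is illustrative: the function names are hypothetical, and the Barro et al. limit is written as log r_e < (log M_stars - 10.3)/1.5.

```python
import math


def is_compact_barro(log_mass, r_e_kpc):
    """Barro et al. (2013) style limit with slope 1/1.5 in the size-mass plane."""
    return math.log10(r_e_kpc) < (log_mass - 10.3) / 1.5


def is_compact_eq4(log_mass, r_e_kpc):
    """Eq. 4 of this paper: slope of unity in the size-mass plane."""
    return math.log10(r_e_kpc) < log_mass - 10.7


def max_radius_kpc(log_mass):
    """Largest r_e (kpc) that still counts as compact under each criterion."""
    return {
        "barro": 10 ** ((log_mass - 10.3) / 1.5),
        "eq4": 10 ** (log_mass - 10.7),
    }


limits = max_radius_kpc(11.0)
print(f"log M = 11.0: Barro limit {limits['barro']:.2f} kpc, "
      f"Eq. 4 limit {limits['eq4']:.2f} kpc")

# A hypothetical galaxy with log M = 11.0 and r_e = 2.2 kpc passes the
# Barro et al. cut but not Eq. 4.
print(is_compact_barro(11.0, 2.2), is_compact_eq4(11.0, 2.2))
```

The two lines cross at log M_stars = 11.5, so Eq. 4 is the more restrictive cut over essentially the whole mass range of the sample.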

In addition to their compactness criterion Barro et al. apply
a mass limit of log M_stars > 10. This relatively low limit is also
used for their comparison samples of quiescent galaxies and
spatially-extended star forming galaxies. However, very few
galaxies that have M_stars ~ 10^10 M_⊙ at z = 2 will grow into
M_stars ~ 10^11 M_⊙ galaxies by z = 0 (e.g., van Dokkum et al.
2010; Leja, van Dokkum, & Franx 2013a; Behroozi et al.
2013). We therefore apply a mass limit that is higher by a
factor of 4: log M_stars > 10.6. This selection produces homogeneous
samples of massive compact galaxies. Another consideration
when choosing this mass limit is that sizes are uncertain
when the effective radius is significantly smaller than
the pixel size (the drizzled pixel size is 0."06, corresponding
to 0.5 kpc at z = 2).

In the remainder of the paper we will use “CMG”,
for “Compact Massive Galaxy”, to denote objects with
log M_stars > 10.6 and log r_e < log M_stars - 10.7. Based on their
location in the UVJ diagram we distinguish “qCMG”, for quiescent
compact massive galaxy, and “sCMG”, for star forming
compact massive galaxy. There are 112 sCMGs at 2.0 < z < 2.5
in the five 3D-HST/CANDELS fields. Five of these have effective
radii < 0.5 kpc; when calculating dynamical masses
and expected velocity dispersions of these galaxies we use
0.5 kpc instead of their best-fitting radius. It should be noted
that many of the star forming progenitors of 2 < z < 2.5
qCMGs are expected to be at higher redshift than z = 2.5; we
discuss the evolution of sCMGs and qCMGs in Sections 7 and
8.

2.4. Expected Galaxy-Integrated Velocity Dispersions and 
Number Densities 

We quantify the compactness of galaxies by their expected 
galaxy-integrated velocity dispersion, as this quantity follows 
directly from our size-mass selection and can be compared to
observations (see Sect. lO l. For simplicity, we use the fol¬ 
lowing relation: 

    \log \sigma_{\rm pred} = 0.5 \left( \log M_{\rm stars} - \log r_e - 5.9 \right),    (5)

with σ_pred the predicted velocity dispersion in km s^-1, M_stars
in units of M_⊙, and r_e in units of kpc (Franx et al. 2008; van
Dokkum, Kriek, & Franx 2009). This relation has been shown
to reasonably predict the observed stellar velocity dispersions
of both quiescent galaxies and star forming galaxies, at least
in the regime where this has been tested: out to z ~ 0.7 for
massive star forming galaxies (Taylor et al. 2010a; Bezanson,
Franx, & van Dokkum 2015) and out to z ~ 2 for massive
quiescent galaxies (Bezanson et al. 2013; van de Sande et al.
2013; Belli et al. 2014a).
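Eq. 5 is straightforward to evaluate, and doing so shows why a size-mass cut with slope unity is equivalent to a cut in predicted dispersion. The sketch below uses a hypothetical function name; the 0.5 kpc floor on the radius follows the treatment of the smallest galaxies described in Sect. 2.3.

```python
import math


def predicted_dispersion_kms(log_mass, r_e_kpc, min_radius_kpc=0.5):
    """Eq. 5: log sigma_pred = 0.5 (log M_stars - log r_e - 5.9),
    with M_stars in solar masses, r_e in kpc, sigma_pred in km/s.
    Radii below min_radius_kpc are replaced by that floor."""
    r_eff = max(r_e_kpc, min_radius_kpc)
    return 10 ** (0.5 * (log_mass - math.log10(r_eff) - 5.9))


# On the compactness boundary, log r_e = log M_stars - 10.7, the mass
# dependence cancels and sigma_pred = 10^(0.5 * 4.8) = 10^2.4 km/s.
for log_mass in (10.6, 11.0, 11.4):
    r_boundary = 10 ** (log_mass - 10.7)
    sigma = predicted_dispersion_kms(log_mass, r_boundary, min_radius_kpc=0.0)
    print(f"log M = {log_mass}: r_e = {r_boundary:.2f} kpc, "
          f"sigma_pred = {sigma:.0f} km/s")
```

Any galaxy below the boundary line has a higher predicted dispersion than this threshold, independent of its mass.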

Our compactness criterion (Eq. 4) corresponds to
log σ_pred > 2.40, or σ_pred > 250 km s^-1. The distributions
of predicted dispersions of sCMGs and qCMGs are
shown by the histograms in Fig. 3. The median expected
dispersions of the two populations are similar but not
identical: σ_pred = 324 km s^-1 for quiescent galaxies and
σ_pred = 284 km s^-1 for star forming galaxies. The reason
for this difference is that the size distribution of quiescent
galaxies is different from that of star forming galaxies. For
star forming galaxies we select the tail of the distribution,
with the largest number of galaxies close to the compactness
cutoff, whereas for quiescent galaxies we select the bulk of
the population (see van der Wel et al. 2014b, for a discussion
of the form of the size distributions of quiescent and star
forming galaxies). Phrased differently, irrespective of the
exact compactness criterion, the smallest galaxies tend to be
quiescent. We will return to this in Sect. 8.1, where we define
a “quenching line” just inside the compact massive galaxy
box.

As shown in Taylor et al. (2010a), the residuals between
expected and observed dispersions correlate with the Sersic
index. The lines in Fig. 3 show the distributions when the
Sersic index of the galaxies is taken into account, using

    \log \sigma_{\rm pred} = 0.5 \left( \log G + \log \beta(n) + \log M_{\rm stars} - \log r_e \right),    (6)

with

    \beta(n) = 8.87 - 0.831 n + 0.0241 n^2    (7)

(Cappellari et al. 2006). Here n is the Sersic index and
G = 4.31 × 10^-6 when M_stars is in units of M_⊙, r_e is in kpc,
and σ_pred is in km s^-1. sCMGs have a slightly smaller median
Sersic index (⟨n⟩ = 2.4) than qCMGs (⟨n⟩ = 2.9). For quiescent
galaxies the line and histogram are nearly the same, but
for star forming galaxies the Sersic-dependent dispersions are
on average ~10% lower than those calculated with Eq. 5.
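To give a sense of how much the Sersic-dependent coefficient varies across the sample, the polynomial of Eq. 7 can be evaluated at the two median indices quoted above. This sketch covers only Eq. 7; the function name is hypothetical.

```python
def beta(n):
    """Eq. 7 (Cappellari et al. 2006): coefficient as a function of Sersic index n."""
    return 8.87 - 0.831 * n + 0.0241 * n ** 2


for label, n in (("sCMG median", 2.4), ("qCMG median", 2.9), ("de Vaucouleurs", 4.0)):
    print(f"{label}: n = {n}, beta = {beta(n):.2f}")
```

The coefficient changes by only about 5% between n = 2.4 and n = 2.9, so differences in Sersic index between the two populations translate into modest shifts in the predicted dispersions.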



Figure 3. Distribution of expected galaxy-integrated velocity dispersions
at 2.0 < z < 2.5, for quiescent compact massive galaxies
(qCMGs; red) and for star forming compact massive galaxies
(sCMGs; blue). Histograms use a simple relation of the form
σ_pred^2 ∝ M_stars/r_e. Our compactness criterion corresponds to σ_pred >
250 km s^-1. Lines use an expression that takes the Sersic index of the
galaxies into account. sCMGs have a median predicted dispersion of
284 km s^-1.

The number density of qCMGs and sCMGs is the same,
0.8 × 10^-4 Mpc^-3 (for reference, the number density of the
full population of quiescent galaxies with log M_stars > 10.6 is
1.2 × 10^-4 Mpc^-3; see Sect. 2.2). This result is consistent with
previous studies that noted the overlap of the compact tail of
star forming galaxies and the bulk of the quiescent population
(Barro et al. 2013; van der Wel et al. 2014b). We therefore
confirm that a population of star forming galaxies can be identified
at 2.0 < z < 2.5 that has a median mass, median size,
and number density similar to the population of massive quiescent
galaxies at the same redshifts. If all these compact star
forming galaxies quench in the near future, the number density
of massive quiescent galaxies will increase by 70%, and
the number density of qCMGs will double.


Table 1

Coordinates of Confirmed Star Forming Compact Massive Galaxies

id^a           RA              DEC             V_606   H_160

AEGIS_9163     14h21m03.68s    53°04'37.3"     25.8    23.2
AEGIS_26952    14h20m40.81s    53°04'51.9"     25.2    22.2
AEGIS_4in4     14h18m32.92s    52°46'06.7"     25.1    22.7
COSMOS_163     10h00m25.01s    2°10'44.1"      25.9    23.2
COSMOS_1014    10h00m35.92s    2°11'27.8"      23.1    21.5

^a Id number in Skelton et al. (2014).

Confirmation from Barro et al. (2014b); RA, DEC, V_606 and H_160 from
Skelton et al. (2014).

3. NEAR-IR SPECTROSCOPY 

We observed candidate sCMGs with the near-IR spectrographs
MOSFIRE (McLean et al. 2012) and NIRSPEC
(McLean et al. 1998) on Keck in 2014 and 2015. The
resulting spectra provide spectroscopic redshifts (measured
from Hα and [N II] at 2.0 < z < 2.7), which can be used
to verify that a population of sCMGs exists at these redshifts.
Furthermore, the spectroscopic observations provide
galaxy-integrated kinematics of the ionized gas: if compact
star forming galaxies are in the process of forming the stars
that are later in compact quiescent galaxies, their gas kinematics
should be similar to the stellar kinematics of quiescent
galaxies. In addition to redshifts and kinematics the spectra
provide star formation rates and strong line ratios; these are
important for understanding the physical processes that take
place in these galaxies, although their interpretation is often
not unique.

3.1. MOSFIRE 

The MOSFIRE spectra were obtained in three separate observing
runs: January 11, 12 2014; April 18, 23, 25 2014; and
Dec 12, 13, 15 2014. The January run suffered from clouds
and poor seeing; conditions were generally good during the
other two runs. Compact, massive star forming galaxies were
not always the main targets, and were not always selected using
the criteria of Sect. 2.2. One target from the April run, a
galaxy at z = 7.730, is described in Oesch et al. (2015). The
December run gave higher priority to galaxies at 3.0 < z < 3.6
than to galaxies at lower redshift. In this paper we will limit
the discussion to star forming galaxies at 2 < z < 2.5 that satisfy
the criteria of Sect. 2.3.

The observations were all taken in the K-band, using a standard
ABAB dither pattern. The exposure times varied from
~1 hr to ~4 hrs, depending on conditions and the requirements
imposed by the primary targets in the masks. One of
the slits in each mask was devoted to a relatively bright, relatively
blue star. This has four important functions: the S/N
ratio of the star is used to weight individual exposures in the
reduction; the y-position of the star is used to correct the data
for small vertical drifts of the mask relative to the sky (see
Kriek et al. 2015); the extracted spectrum is used to identify
regions of strong sky absorption; and the width of the 2D stellar
spectrum in the spatial direction provides us with a model
of the point spread function (PSF) that is otherwise very difficult
to construct (see Sect. 6.2).

The data reduction used the standard MOSFIRE pipeline
DRP* with small modifications (see Oesch et al. 2015). Individual
sequences were reduced and shifted to a common reference
frame before stacking. One-dimensional spectra were
obtained from the 2D spectra by summing rows, as dictated
by the observed spatial extent of the galaxies. For each mask
an empirical noise spectrum was created by removing all rows
with signal, and determining the width of the pixel distribution
of the remaining rows for each pixel in the wavelength
direction. The width was measured by removing the lowest
and highest 16% of values, and is therefore equivalent to the
±1σ width of a Gaussian. For each individual galaxy in a
mask the noise spectrum was multiplied by the square root of
the number of rows that was summed to create the 1D spectrum
of that object.
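The empirical noise estimate described above can be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline code: the function names are hypothetical, the 2D spectrum is represented as a plain list of rows, the percentile uses simple linear interpolation, and the 1σ noise is taken as half the width of the central 68% of the pixel distribution.

```python
import math


def percentile(sorted_vals, p):
    """Percentile (0-100) of an already sorted list, with linear interpolation."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def robust_sigma(values):
    """Half the width of the distribution after discarding the lowest and
    highest 16% of values, equivalent to 1 sigma for a Gaussian."""
    s = sorted(values)
    return 0.5 * (percentile(s, 84) - percentile(s, 16))


def noise_spectrum(empty_rows):
    """Per-wavelength noise from rows without signal.
    empty_rows[i][j] is the flux in spatial row i at wavelength pixel j."""
    n_wave = len(empty_rows[0])
    return [robust_sigma([row[j] for row in empty_rows]) for j in range(n_wave)]


def object_noise(noise, n_rows_summed):
    """Scale the per-row noise to a 1D spectrum made by summing n rows."""
    return [sigma * math.sqrt(n_rows_summed) for sigma in noise]


# Toy input: 101 empty rows, two wavelength pixels with different scatter.
rows = [[float(i), 2.0 * i] for i in range(101)]
per_row = noise_spectrum(rows)
print(per_row, object_noise(per_row, 4))
```

Using the central 68% of the distribution makes the estimate insensitive to outliers such as residual sky lines or bad pixels, which would inflate a plain standard deviation.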

3.2. NIRSPEC 

The NIRSPEC data were obtained in two runs, January 10,
13, 14 2014 and January 25, 26 2015. Conditions were poor
in the 2014 run and the only object in our final sample that
came from it is GOODS-N_774, which was published in Nelson
et al. (2014). Conditions in 2015 were excellent, with the
seeing ranging from 0."3 to 0."6 during both nights. The selection
for the NIRSPEC runs was very similar to that described
in Sect. 2.3; within these criteria priority was generally given
to galaxies with higher star formation rates (and with good
blind offset stars; see below).

We followed standard observing procedures for NIRSPEC 
spectroscopy of faint targets (see, e.g., Erb et al. 2003; van 
Dokkum et al. 2004). Target acquisition was done with blind
offsets from nearby stars, as the galaxies are not detected in 
the SCAM slit-viewing camera. The N6 filter was used for 
GOODS-N_774; all data in the 2015 run were taken with the 
N7 filter. A typical observing sequence consisted of four 900 s 
exposures in an ABBA pattern with 1" offsets between nods. 
The data were continuously inspected as objects sometimes 
drift out of the slit. 

The data reduction followed standard procedures for near-IR,
single slit data (see, e.g., van Dokkum et al. 2004). The
data were initially reduced in pairs, using the sky of the A
frame for the B frame and vice versa. This method yields
relatively clean, photon noise-dominated spectra, at the expense
of reducing the S/N in the final frames by √2 (see, e.g., Kriek
et al. 2015). Wavelength calibration was done using sky lines,
which were also used to determine the spectral resolution of
the data (see Sect. 3.4.1). The slit is not long enough to obtain
an accurate noise spectrum from empty regions; therefore, we
calculate the noise spectrum from the sky spectrum and the
noise in the darks. An analysis of the residuals from fits to the
emission lines shows that this is sufficient for our purposes
(see Sect. 3.4.1).

* https://code.google.com/p/mosfire/
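The S/N cost of pairwise sky subtraction follows from adding independent noise in quadrature: subtracting one noisy frame from another raises the per-pixel noise by a factor √2 relative to a single frame. The sketch below illustrates that arithmetic under the assumption of equal, independent Gaussian noise in the A and B frames; the function names are hypothetical and it is not part of the actual reduction.

```python
import math
import random
import statistics


def pair_subtracted_noise(sigma_frame):
    """Per-pixel noise of A - B when both frames have noise `sigma_frame`."""
    return math.hypot(sigma_frame, sigma_frame)


def simulate_pair_noise(sigma_frame, n_pixels=20000, seed=0):
    """Monte Carlo check: standard deviation of A - B for pure-noise frames."""
    rng = random.Random(seed)
    diff = [rng.gauss(0.0, sigma_frame) - rng.gauss(0.0, sigma_frame)
            for _ in range(n_pixels)]
    return statistics.pstdev(diff)
```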

3.3. Results and Comparison to Parent Sample 

We identify the redshifted Hα and [N II] emission lines in
20 out of 24 compact, massive star forming galaxies with expected
redshifts in the range 2.0 < z < 2.5. This success rate
of 83 % is encouraging, but it should be noted that our selection
at the telescope was somewhat subjective, particularly
in the NIRSPEC runs. As an example, if there were two plausible
targets and one showed a hint of an Hα contribution to
the broad band flux we would generally give it preference.
Additionally, there are five non-overlapping galaxies in Barro
et al. (2014b) that satisfy our criteria (see Sect. 3.5); the total
sample of massive compact star forming galaxies with Hα
measurements is therefore 25 (Table 1).

The properties of the galaxies in the spectroscopic sample
are compared to the parent sample in Fig. 4. The median size
and mass are r_e = 1.3 kpc and M_stars = 1.0 × 10^11 M⊙ respectively,
close to the medians of the parent sample. The spread is
somewhat smaller; 24 out of 25 galaxies are in the mass range
10.7 < log M_stars < 11.3. The galaxies have bluer U − V colors
and slightly higher UV+IR star formation rates than the parent
sample. This is by selection: galaxies with specific star formation
rates SSFR < 10^-9 yr⁻¹ were given lower priority. Despite
the lack of galaxies with low star formation rates in the
spectroscopic sample, the median SSFR is only 0.1 dex higher
than that of the parent sample (log SSFR = −8.8 yr⁻¹ compared
to log SSFR = −8.9 yr⁻¹ for the parent sample). Both medians
are close to the Whitaker et al. (2014) main sequence for this
redshift (dark grey line in Fig. 4c). Panel d of Fig. 4 shows the
dust content of the galaxies, as parameterized by both the ratio
of the IR and UV luminosities and the rest-frame V − J color.
Galaxies in the upper right part of this panel are very dusty,
with the re-radiated IR emission exceeding the UV emission
by a factor of > 100. The median L_IR/L_UV ratio of the parent
sample is ⟨L_IR/L_UV⟩ = 64. The median ratio for the galaxies
in the spectroscopic sample is slightly lower, at 42. We only
have a few spectroscopic objects in this part of the diagram,
and all four spectroscopic failures are located here. We infer
that the most likely explanation for the failures is that the Hα
emission in these galaxies is too obscured for a detection in
our current observations.

The Keck spectra of the 20 galaxies that we observed are
shown in Fig. 5. The galaxies are ordered by the measured
velocity dispersion (see below). We include the five objects
from Barro et al. (2014b) that satisfy our selection criteria;
as we cannot show the spectra of these objects in Fig. 5 we
instead show models that are based on their published
best-fitting parameters.

Figures 6 and 7 show the HST images and the rest-frame
UV - near-IR spectral energy distributions (SEDs) of the 25
galaxies of Fig. 5. The H_160 images are shown separately at
high dynamic range in Appendix A. The SEDs range from
relatively unobscured (COSMOS_1014) to extremely dusty
(e.g., GOODS-N_774). Some have excess emission in the
IRAC bands (UDS_42571; see, e.g., Mentuch et al. 2009).
Two galaxies show clear signs of merging: COSMOS_11363
is an ongoing merger between two compact massive galaxies
that are only 0."6 apart, and GOODS-S_30274 is probably a
merger remnant (see Appendix A). Interestingly, there is no clear
relation between the measured velocity dispersion and either
the morphology or the SED. Phrased differently, it is not possible
to predict the Hα line width based on the information
shown in Figs. 6 and 7.

Somewhat amazing really, particularly when considering that only a
handful of these objects had a previously measured secure redshift from the
ground or the grism.

Figure 4. Comparison of objects with near-IR spectra to the parent population of compact, massive star forming galaxies at 2.0 < z < 2.5.
Panels show the size-mass relation (a), the UVJ diagram (b), the star formation - mass relation (with the Whitaker et al. (2014) “main
sequence” indicated) (c), and the relation between L_IR/L_UV and the rest-frame V − J color (d). Solid blue symbols are objects in the sample
described here. Open symbols are galaxies from Barro et al. (2014b) that fall in our selection box. Grey points are observed galaxies whose
spectrum did not show any clear features.


Figure 5. Spectra of the 20 sCMGs in our sample with 2.0 < z < 2.5. Red lines show best-fitting models, as determined with the emcee code
(Foreman-Mackey et al. 2013). We also show the best-fitting models of five galaxies from Barro et al. (2014b) that satisfy our selection criteria
(red lines without data); these objects are included in our analysis. The galaxies are ordered by their observed line widths, which range from
~ 50 km s⁻¹ to ~ 700 km s⁻¹.


3.4. Redshifts, Fluxes, Line Widths, and Line Ratios

3.4.1. Fitting

The spectra were fitted with a model that has the redshift,
the continuum level, the [N II] and Hα line fluxes, and the
line width as free parameters. The instrumental resolution is
explicitly taken into account. The model has the following
form:

$$M(\lambda) = L(\lambda) \ast R(\lambda) + C, \tag{8}$$

with L(λ) the model for the line emission, R(λ) the instrumental
resolution, C the continuum level, and * denoting convolution.
The instrumental resolution is modeled with a Gaussian:

$$R(\lambda) = \frac{\Delta\lambda}{\sqrt{2\pi}\,\sigma_{\rm instr}} \exp\left[-0.5\left(\frac{\lambda-\lambda_{\rm cen}}{\sigma_{\rm instr}}\right)^{2}\right], \tag{9}$$

with σ_instr measured from sky lines in the vicinity of the redshifted
Hα line, Δλ the pixel size in Å, and λ_cen the center of
the fitting range. Expressed as a velocity, the resolution of the
MOSFIRE spectra is ≈ 35 km s⁻¹, and the resolution of the
NIRSPEC data is ≈ 80 km s⁻¹. The lines are parameterized as
follows:


^(•^) -/HaT6563(A) + /[NlI] ( T6584(A)-f-L6548(A), ) (10) 


with

$$L_{\lambda_0}(\lambda) = \frac{\Delta\lambda}{\sqrt{2\pi}\,\sigma} \exp\left[-0.5\left(\frac{\lambda-(1+z)\lambda_0}{\sigma}\right)^{2}\right]. \tag{11}$$

Here f is the line strength, σ is the galaxy-integrated line-of-sight
velocity dispersion, λ_0 is the rest-frame wavelength of
the line (with λ_0 = 6562.8 and λ_0 = 6548.1, 6583.6 for Hα
and the two [N II] lines respectively), and z is the redshift.
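A minimal sketch of how such a model can be evaluated is given below. Several choices in it are assumptions made for illustration and are not taken from the paper: the velocity dispersion is converted to Å as σ_λ = (σ/c)(1+z)λ_0; the convolution with the Gaussian resolution element is done analytically by adding the intrinsic and instrumental widths in quadrature; and the relative weight of the 6548 Å line is exposed as a parameter `doublet_ratio`, defaulting to the theoretical value of about one third. All function names are hypothetical.

```python
import math

C_KMS = 299792.458                                   # speed of light in km/s
LAM_HA, LAM_NII_BLUE, LAM_NII_RED = 6562.8, 6548.1, 6583.6   # Angstrom


def line_profile(lam, lam0, z, sigma_kms, sigma_instr, dlam):
    """Unit-flux Gaussian line, already broadened by the instrument."""
    centre = (1.0 + z) * lam0
    sigma_lam = sigma_kms / C_KMS * centre           # assumed km/s -> Angstrom
    sigma_tot = math.hypot(sigma_lam, sigma_instr)   # Gaussian * Gaussian
    norm = dlam / (math.sqrt(2.0 * math.pi) * sigma_tot)
    return norm * math.exp(-0.5 * ((lam - centre) / sigma_tot) ** 2)


def model_spectrum(wave, z, cont, f_ha, f_nii, sigma_kms, sigma_instr,
                   doublet_ratio=1.0 / 3.0):
    """Line emission plus constant continuum on an evenly spaced grid."""
    dlam = wave[1] - wave[0]
    spectrum = []
    for lam in wave:
        lines = f_ha * line_profile(lam, LAM_HA, z, sigma_kms, sigma_instr, dlam)
        lines += f_nii * line_profile(lam, LAM_NII_RED, z, sigma_kms,
                                      sigma_instr, dlam)
        lines += f_nii * doublet_ratio * line_profile(
            lam, LAM_NII_BLUE, z, sigma_kms, sigma_instr, dlam)
        spectrum.append(lines + cont)
    return spectrum
```

With this normalization the sum of a single line over all pixels equals its line strength f, which is what makes the equivalent-width calculation in the next subsection straightforward.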

Some galaxies show evidence for multiple velocity components
(e.g., COSMOS_1014). We do not attempt to separately
fit broad and narrow velocity components to these galaxies (as
was done by, e.g., Förster Schreiber et al. 2014). As discussed
later, broad components could indicate the presence of winds
but could also indicate rapidly rotating gas at small radii in
the galaxies. In the absence of high spatial resolution data, it
is difficult to distinguish these possibilities; we therefore simply
interpret the Hα-luminosity-weighted velocity profiles in
this paper. It should be noted that the formal uncertainties
underestimate the error in the velocity dispersion if the velocity
distribution is not Gaussian. This is particularly important for
galaxies with a high S/N ratio, such as COSMOS_12020.

Figure 6. HST images of the galaxies of Fig. 5, created from the WFC3 H_160, J_125 and summed ACS V_F606W+F814W images. Each image is 4."8 × 4."8,
corresponding to approximately 40 kpc × 40 kpc. The H_160 magnitudes and circularized effective radii are listed in the images. Note that the
galaxies were selected to be compact in mass, and are not necessarily compact in light. There is generally little evidence for spiral arms,
star forming clumps, or other structure. Two galaxies show evidence for past (GOODS-S_30274) and ongoing (COSMOS_11363) mergers.
The galaxies are ordered by their Hα velocity dispersion, as in Fig. 5. There is no clear relation between HST morphology and Hα velocity
dispersion in this sample.

The emcee MCMC algorithm (Foreman-Mackey et al.
2013) was used to fit this model to the galaxy spectra. The fit
was done over the wavelength region (1+z)λ_6548 − 200 < λ <
(1+z)λ_6584 + 200; the results are not dependent on the choice
of fitting region as long as the continuum is reasonably well
covered. Priors are top hats with boundaries that comfortably
encompass the fitting results. That is, the Bayesian aspects of
emcee were essentially turned off. We used 100 walkers and
generated 500 chains in each fit. Burn-in was typically fast,
but we removed the first 200 chains when calculating errors.
For each fit parameter the best fit is defined as the median of
the 300 remaining samples. Errors were determined from the
16th and 84th percentiles (see Foreman-Mackey et al. 2013,
for details). The best fit models are shown by red lines in Fig.
5. Residuals from the fits are shown in Fig. B1. As discussed
in Appendix B the residuals are consistent with the expected
noise in almost all cases.

Figure 7. Restframe UV to near-IR spectral energy distributions of the galaxies of Fig. 5. The red spectra are the best-fitting EAZY (Brammer
et al. 2008) models; open red circles show the model fluxes in the observed filters. The SEDs show a large variety, ranging from blue, relatively
unobscured emission (COSMOS_1014) to very red SEDs with high inferred dust content (e.g., UDS_42571 and GOODS-N_774). As in Fig. 6,
there is no obvious relation between the SEDs of the galaxies and the measured velocity dispersions of their ionized gas.
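The post-processing of the MCMC output described in Sect. 3.4.1 (discard the burn-in, take the median as the best fit, and use the 16th and 84th percentiles for the errors) can be sketched as follows. This is a standard-library illustration, not the emcee API and not the authors' code: the layout `chain[step][walker]`, the pooling of all walkers after burn-in, and the function names are assumptions.

```python
import statistics


def percentile(sorted_values, p):
    """Linearly interpolated percentile (0 <= p <= 100) of a sorted list."""
    pos = p / 100.0 * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def summarize_chain(chain, burn_in=200):
    """Return (best fit, lower error, upper error) for one parameter.

    `chain[step][walker]` holds the sampled values; the first `burn_in`
    steps are discarded for every walker before computing the statistics.
    """
    kept = sorted(value for step in chain[burn_in:] for value in step)
    best = statistics.median(kept)
    return best, best - percentile(kept, 16.0), percentile(kept, 84.0) - best
```

Because the errors come from percentiles of the samples, they can be asymmetric about the median.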


3.4.2. Calibration 

The redshifts and velocity dispersions follow directly from
the MCMC fit, but the line fluxes, equivalent widths, and line
ratios need to be calibrated or corrected. The continuum is
detected for every galaxy, which makes it possible to calculate
equivalent widths directly from the spectra. The equivalent
widths, in turn, enable us to calibrate the line fluxes using the
known K_s-band magnitudes of the galaxies. The equivalent
width of Hα in the observed frame is given by

$$EW_{\rm H\alpha} = \Delta\lambda\,\frac{f_{\rm H\alpha}}{C} + EW_{\rm H\alpha,abs}\,(1+z). \tag{12}$$

The second term is a correction for the underlying stellar
continuum absorption, which has a non-negligible effect on
the measured equivalent widths and line ratios in our sample.
We adopt EW_Hα,abs = 3 Å (Moustakas & Kennicutt 2006;
Alonso-Herrero et al. 2010). The relation between rest-frame
equivalent width and the observed equivalent width is
EW⁰_Hα = EW_Hα/(1+z). The mean rest-frame equivalent width
in our sample is ⟨EW⁰_Hα⟩ = 71 Å, consistent with the general
population of (detected) massive star forming galaxies at these
redshifts (Fumagalli et al. 2012). The [N II]/Hα ratio, corrected
for absorption, is

$$[{\rm N\,II}]/{\rm H\alpha} = \frac{f_{\rm [N\,II]}}{f_{\rm H\alpha}} \times \frac{EW^{0}_{\rm H\alpha} - EW_{\rm H\alpha,abs}}{EW^{0}_{\rm H\alpha}}, \tag{13}$$

with f taken from the MCMC fit. Note that we use positive
values for both absorption equivalent widths and emission
equivalent widths in these expressions, as “absorption”
here is more accurately described as “emission that is filling
in the underlying absorption line”.
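The equivalent-width bookkeeping above can be written out explicitly. The sketch assumes that the observed equivalent width is Δλ f_Hα/C plus the absorption term EW_Hα,abs(1+z), that the rest-frame value follows by dividing by (1+z), and that the measured [N II]/Hα flux ratio is scaled by (EW⁰ − EW_abs)/EW⁰. The function names and the default of 3 Å for the absorption are illustrative; this is not the authors' code.

```python
def ew_observed(f_ha, continuum, dlam, z, ew_abs=3.0):
    """Observed-frame H-alpha equivalent width (Angstrom), absorption-corrected."""
    return dlam * f_ha / continuum + ew_abs * (1.0 + z)


def ew_rest(ew_obs, z):
    """Rest-frame equivalent width from the observed-frame value."""
    return ew_obs / (1.0 + z)


def nii_ha_corrected(f_nii, f_ha, ew0, ew_abs=3.0):
    """[N II]/H-alpha ratio after accounting for stellar H-alpha absorption."""
    return (f_nii / f_ha) * (ew0 - ew_abs) / ew0
```

For a rest-frame equivalent width near the sample mean of 71 Å, the correction lowers the raw line ratio by roughly 4 %; it grows for galaxies with weaker emission.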

The line flux is calculated from the observed equivalent 
width and the K magnitude using 

fka = 1.02 X 10-'^ X X IQ(K,-22)I-2.5 ^ (^ 4 ) 

with K_s the AB magnitude of the object and F in units of
ergs s⁻¹ cm⁻². This expression ignores small differences between
the filters used in each field as well as the detailed
shape of the continuum within the K_s filter. We verified that
the transmission at the observed wavelengths of the lines is
within ~ 5 % of the central transmission of the filter in all
cases. Finally, the line luminosity is calculated using

$$L_{\rm H\alpha} = 1.20 \times 10^{50} \times D^{2}\,F_{\rm H\alpha}, \tag{15}$$

with D the luminosity distance in Mpc and L in ergs s⁻¹. The
results for all galaxies are listed in Table 2. The error bars
reflect the (propagated) MCMC errors; no additional calibration
uncertainty was included in the error budget.
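The numerical coefficient in the luminosity expression is simply 4π times the square of one megaparsec in centimeters, which can be checked directly. The sketch below assumes the standard relation L = 4π D² F with D the luminosity distance; the constant and function names are illustrative.

```python
import math

MPC_IN_CM = 3.0857e24   # one megaparsec in centimeters

# Coefficient that multiplies D^2 (Mpc^2) * F (erg/s/cm^2) to give L in erg/s.
COEFFICIENT = 4.0 * math.pi * MPC_IN_CM ** 2


def line_luminosity(flux_cgs, d_lum_mpc):
    """Line luminosity in erg/s from a flux in erg/s/cm^2 and D_L in Mpc."""
    return COEFFICIENT * d_lum_mpc ** 2 * flux_cgs
```

The coefficient evaluates to about 1.2 × 10^50, so a line flux of 10^-16 ergs s⁻¹ cm⁻² at a luminosity distance of 10^4 Mpc corresponds to roughly 1.2 × 10^42 ergs s⁻¹.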

3.5. Comparison to Barro et al. 

There are seven galaxies in the Barro et al. (2014b)
sample that satisfy our more restrictive selection criteria.
Two of these seven galaxies, COSMOS_12020 and GOODS-S_37745,
are also in our sample: COSMOS_12020 was observed
with NIRSPEC and GOODS-S_37745 with MOSFIRE.
For COSMOS_12020 we find σ = 719 ± 30 km s⁻¹ and
[N II]/Hα = 1.39 ± 0.23, whereas Barro et al. have σ = 352 ±
213 km s⁻¹ and [N II]/Hα = 0.25 ± 0.25. The kinematics of this
galaxy are very complex, and a Gaussian is a poor fit (see Fig.
5 and Sect. 9.2): this probably explains the differences between
the two measurements and the large uncertainty in the
Barro et al. values. As noted in Sect. 3.4.1 the formal uncertainty
in our measurement of this galaxy is smaller than the
true uncertainty, as it does not take deviations from a Gaussian
into account. Given that a Gaussian is clearly a poor
fit, the velocity dispersion of this galaxy is not well determined.
For GOODS-S_37745 we find σ = 163 ± 24 km s⁻¹ and
[N II]/Hα = 0.65 ± 0.23, compared to σ = 197 ± 37 km s⁻¹ and
[N II]/Hα = 0.77 ± 0.30 in Barro et al. (2014b). These values
are in agreement within the (relatively large) 1σ uncertainties.

For the two galaxies that overlap we use our own measurements.
The other five galaxies from Barro et al. are added
to our sample (see Tables 1 and 2). We do not have measurements
of the line flux or spatial extent of the emission
line gas for these objects, but they are included in the analysis
whenever only the redshift, velocity dispersion, or line ratio is
needed. They are shown in Fig. 5 by their best-fitting models.
The total number of sCMGs at 2.0 < z < 2.5 that are studied
in this paper is 25.

4. INTERPRETATION OF THE LINE RATIOS AND 
LUMINOSITIES 

4.1. Line Ratios 


Considering that the 25 sCMGs of Fig. 5 were selected in
a very restricted region of parameter space, their emission
lines show a surprisingly large range of properties. The velocity
dispersions range from 50 km s⁻¹ to > 500 km s⁻¹, the
[N II]/Hα ratios from 0.2 to > 2, and the Hα line luminosities
from 1.3 X 10^^ Lq to 1.2 x 10^^ Lq. Two of these param¬ 
eters, the [Nll]/Ha ratio and the velocity dispersion, show a 
significant correlation: as shown in Fig. 8, galaxies with the
highest velocity dispersions tend to have the highest line ratios.
The correlation has a formal significance of > 99 %. The
broken line is the best fit relation, which has the form

log^ = (-0.51 ±0.08)+ (1.0±0.2)log(^) . (16) 

The canonical high-metallicity saturation value for [N II]/Hα
in low redshift star forming galaxies is ~ 0.4 (e.g., Baldwin,
Phillips, & Terlevich 1981; Denicolo, Terlevich, & Terlevich
2002; Pettini & Pagel 2004; Kewley et al. 2013). Although
this limit is observed to be higher at z > 2 (e.g., Brinchmann,
Pettini, & Charlot 2008; Steidel et al. 2014; Shapley et al.
2015), values of [N II]/Hα > 1 are extreme at any redshift (see,
e.g., Leja et al. 2013b; Shapley et al. 2015). A likely explanation
for the highest σ, highest [N II]/Hα galaxies in Fig. 8 is
that shocks (Dopita & Sutherland 1995) and/or emission from
AGNs (Kewley et al. 2013) are responsible for the line ratios.



Figure 8. Relation between [N II]/Hα ratio and Hα velocity dispersion
for the 25 sCMGs. There is a significant correlation, such that
galaxies with higher velocity dispersions have higher [N II]/Hα ratios.
Orange symbols are galaxies with X-ray-identified AGN. The
four galaxies with the highest observed dispersions are all X-ray
AGN, as are five of the six galaxies with the highest [N II]/Hα ratios.
The black point with [N II]/Hα = 0.3 and σ = 352 km s⁻¹ is GOODS-N_774,
which was previously published in Nelson et al. (2014).

This is supported by the X-ray luminosities of the objects,
obtained from all public catalogs in the CANDELS fields.
Twelve of the 25 sCMGs (48 %) have L_X > 10^^ ergs s⁻¹ and
are classified as AGN. The X-ray luminosities range from
L_X = 1.4 × 10^^ ergs s⁻¹ for GOODS-S_30274 to L_X = 6 ×
10^^ ergs s⁻¹ for COSMOS_11363. This high AGN fraction
is consistent with previous studies of massive star forming
galaxies at these redshifts (e.g., Papovich et al. 2006; Daddi
et al. 2007; Kriek et al. 2007; Barro et al. 2013; Förster
Schreiber et al. 2014). The four galaxies with the highest
velocity dispersions are all classified as X-ray AGN. Their
kinematics are complex (see Fig. 5), and their [N II]/Hα ratios
range from 0.8 to 2.2. It is likely that the observed emission
line properties of these galaxies are affected by the presence
of the AGN, either directly through emission from the
broad line region or indirectly through AGN-driven winds
(see Förster Schreiber et al. 2014; Genzel et al. 2014a).

The catalogs were searched using the tools of the NASA
High Energy Astrophysics Science Archive Research Center
(http://heasarc.gsfc.nasa.gov/). We note, however, that the X-ray
coverage in the CANDELS fields is not uniform.

Table 2

Properties of Star Forming Compact Massive Galaxies^a

id^b            z      M_stars      r_e    n    q     SFR^c       log(L_IR/L_UV)  X-ray  instr
                       (10^11 M⊙)   (kpc)             (M⊙ yr⁻¹)
AEGIS_9163      2.445  0.8          0.9    5.4  0.72  131         1.81                   NIRS
AEGIS_26952     2.097  1.1          1.8    3.6  0.64  148         1.62            yes    NIRS
AEGIS_41114     2.332  0.5          0.2    8.0  0.62  95          1.38                   NIRS
COSMOS_163      2.312  0.8          1.1    2.5  0.60  336         2.25            yes    MOSF
COSMOS_1014     2.100  0.5          0.7    8.0  0.79  150         0.93                   NIRS
COSMOS_11363    2.096  1.1          2.1    5.2  0.76  169         1.31            yes    NIRS
COSMOS_12020    2.094  2.0          2.1    5.7  0.57  185         1.96            yes    NIRS
COSMOS_22995    2.469  1.2          1.1    2.8  0.67  188         1.41            yes    NIRS
COSMOS_27289    2.234  1.3          2.3    3.3  0.81  398         2.02                   NIRS
GOODS-N_774     2.301  1.0          1.0    2.9  0.59  150         2.07                   NIRS
GOODS-N_6215    2.321  1.8          1.8    2.6  0.72  110         1.28            yes    MOSF^d
GOODS-N_13616   2.487  1.1          1.9    5.6  0.97  130         1.79                   MOSF^d
GOODS-N_14283   2.420  0.9          1.2    2.7  0.86  111         1.43            yes    MOSF^d
GOODS-N_22548   2.330  1.0          1.7    5.9  0.78  120         1.53            yes    MOSF^d
GOODS-S_5981    2.253  0.8          0.8    4.4  0.85  206         1.75                   MOSF
GOODS-S_30274   2.226  1.4          2.5    8.0  0.46  404         1.47            yes    MOSF
GOODS-S_37745   2.432  0.9          0.6    3.6  0.94  118         1.04                   MOSF
GOODS-S_45068   2.453  1.1          1.3    4.9  0.97  139         1.57                   MOSF^d
GOODS-S_45188   2.407  0.7          1.4    4.3  0.90  134         1.66            yes    NIRS
UDS_16442       2.218  1.7          3.3    1.6  0.52  176         2.36                   MOSF
UDS_25893       2.304  0.6          0.2    8.0  0.92  73          1.88            yes    MOSF
UDS_26012       2.321  1.3          2.6    3.5  0.73  109         1.47                   MOSF
UDS_33334       2.290  0.7          1.4    2.4  0.56  13          1.01                   MOSF
UDS_35673       2.182  0.9          0.7    6.4  0.75  492         2.18                   MOSF
UDS_42571       2.292  1.6          2.3    1.9  0.82  388         2.39            yes    NIRS

^a Uncertainties do not include possible effects of non-Gaussian velocity distributions.
^b Id number in Skelton et al. (2014).
^c Star formation rate from UV+IR emission.
^d Velocity dispersion and [N II]/Hα from Barro et al. (2014b).
However, it is not clear whether AGNs or winds dominate
the observed, galaxy-integrated kinematics, even for these
four objects - and whether the presence of a central point
source influenced their selection as apparently compact,
apparently massive galaxies. As shown in Fig. 7 the UV - near-IR
SEDs of all galaxies are well fit by stars-only models. Most
galaxies have strong Balmer breaks (including the most powerful
X-ray source in the sample, COSMOS_11363), and as
discussed in Kriek et al. (2007) and later studies (e.g., Marsan
et al. 2015) this strongly constrains the contribution of continuum
emission from an AGN at λ_rest ~ 4000 Å. As we show
below and in the following section, the properties of most of
the galaxies can be understood in a model where AGN are
present but do not dominate the kinematics, line ratios, line
luminosities, or morphology.

The correlation between [N II]/Hα and σ is no longer significant when
these four objects are removed.

4.2. Star Formation Rates 

The Hα luminosities can be converted to star formation
rates if it is assumed that the Hα emission largely originates
in H II regions. By comparing these star formation rates to
those derived from the UV and the bolometric UV+IR luminosities
we can assess whether this assumption is reasonable,
and also constrain the amount of obscuration in the galaxies.
The Hα star formation rates were determined using the Kennicutt
(1998) relation, converted to a Chabrier (2003) IMF.
The UV luminosities come from the best-fitting Brammer
et al. (2008) models at λ_rest = 2500 Å, and the IR luminosities
are converted Spitzer/MIPS 24 μm fluxes (see Whitaker
et al. 2012 and Sect. 2.1).

The relation between the UV and UV+IR star formation rates
and the Hα star formation rate is shown in Fig. 9. Only the 20
galaxies from our own spectroscopy are considered here, as
we do not have self-consistent measurements of L(Hα) for the
five objects from Barro et al. (2014b). The Hα star formation
rates range from 6 M⊙/yr to 58 M⊙/yr. They correlate with
the UV star formation rates (98 % significance) and with the
UV+IR star formation rates, which are dominated by the IR
(96 % significance). The mean offset between SFR(Hα) and
SFR(UV) is 0.47 ± 0.06 dex, with an rms scatter of 0.22 dex.
The offset between SFR(Hα) and SFR(UV+IR) is −1.00 ±
0.09 dex, with a scatter of 0.27 dex. The implication is that
the Hα emission misses ~ 90 % of the star formation, and the
UV misses ~ 97 %. The ratios between the three indicators
are broadly consistent with expectations from a Calzetti et al.
(2000) reddening curve, if there is ~ 50 % more dust toward
nebular emission line gas than toward the UV continuum.

For consistency with previous studies we use a Chabrier (2003) IMF as
the default, even though these galaxies may have a more bottom-heavy IMF
(see, e.g., Conroy & van Dokkum 2012).
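The step from logarithmic offsets to "fraction of star formation missed" is a one-line conversion, shown here as a worked example. The only inputs are the mean offsets quoted in the text (−1.00 dex for Hα relative to UV+IR, and +0.47 dex for Hα relative to UV); the variable and function names are illustrative.

```python
def fraction_missed(offset_dex):
    """Fraction of the total SFR missed by a tracer lying `offset_dex` below it."""
    return 1.0 - 10.0 ** offset_dex


HA_VS_TOTAL = -1.00                       # log SFR(Ha) - log SFR(UV+IR)
HA_VS_UV = 0.47                           # log SFR(Ha) - log SFR(UV)
UV_VS_TOTAL = HA_VS_TOTAL - HA_VS_UV      # log SFR(UV) - log SFR(UV+IR) = -1.47

ha_missed = fraction_missed(HA_VS_TOTAL)  # about 0.90
uv_missed = fraction_missed(UV_VS_TOTAL)  # about 0.97
```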



Figure 9. Relation between the star formation rate derived from Hα
and the star formation rate derived from the UV (blue points) and
UV+IR (black points with errorbars). X-ray AGN are indicated with
orange centers. The Hα star formation rates fall in between the UV
and UV+IR ones, as expected from the effects of dust extinction.
The obscuration toward Hα is a factor of 10, with a scatter of only a
factor of 2. The X-ray sources are indistinguishable from the other
galaxies.

The X-ray AGNs are indicated by orange points in Fig. 9.
Remarkably, they are indistinguishable from the other objects;
they span the same range in Hα luminosity, and they follow
the same relations with the UV and UV+IR luminosities. The
offsets between the AGN and non-AGN are consistent with
zero. This suggests, but does not prove, that the Hα, UV,
and IR luminosities of most galaxies are dominated by star
formation.

5. INTERPRETATION OF THE VELOCITY
DISPERSIONS

5.1. Are the Gas Dynamics Similar to the Stellar Dynamics 
of Compact Quiescent Galaxies? 

The velocity dispersions we measure come from Gaussian
fits to the galaxy-integrated, luminosity-weighted Hα line
profile and are equivalent to the second moment of the velocity
distribution of the gas. They should not be confused
with the rotation-corrected gas dispersions within
spatially-resolved disks, such as discussed by, e.g., Kassin et al. (2012)
and Förster Schreiber et al. (2014). The measured dispersions
are a complex function of the dynamics and gas distribution
in the galaxies:

$$\sigma_{\rm gas}^{2} = \alpha^{2} V_{\rm rot}^{2} \sin^{2}(i) + \sigma_{\rm ISM}^{2} + w^{2}(i)\,\sigma_{\rm wind}^{2}, \tag{17}$$

with α ≈ 0.8 (Franx 1993; Rix et al. 1997; Weiner et al. 2006;
see also Appendix C), i the inclination of the galaxy (i = 0° is
face-on, and i = 90° is edge-on), σ_ISM the galaxy-integrated
dispersion within the gas clouds, and w(i)σ_wind an
inclination-dependent term that takes non-gravitational motions into
account. A further complication is that Eq. 17 is the result of an
integral over the area of the galaxy that falls within the slit,
weighted by the spatially-varying luminosity of the Hα line.

We refer to other studies for more detailed analysis of the attenuation
toward H II regions (e.g., Price et al. 2014, Reddy et al. 2015).

We first assume that the gas in the sCMGs “behaves” in a
similar way as the stars in qCMGs. That is, we assume that
the stars in qCMGs were formed directly out of the (detected)
gas in sCMGs, such that they have the same distribution and
kinematics. This has been assumed in previous studies of the
kinematics of compact massive star forming galaxies (Nelson
et al. 2014; Barro et al. 2014b) and it may be reasonable
if many compact, massive quiescent galaxies are direct
descendants of the sCMGs. As discussed in Sect. 2.4 the stellar
velocity dispersions of quiescent galaxies can be predicted
from their stellar masses and effective radii (e.g., Taylor et al.
2010a; Bezanson et al. 2011; Belli et al. 2014b). Figure 10a
shows the relation between σ_gas and the predicted velocity
dispersion. The predicted dispersions were calculated using the
Sersic-dependent relation Eq. 6.

There is no significant correlation between σ_gas and σ_pred,
for either the full sample or the sample with the X-ray AGN
removed. The rms scatter in σ_gas/σ_pred is 0.26 dex. Given
that we are ignoring the effects of non-gravitational motions,
it is striking that many galaxies have lower velocity dispersions
than the expectations. The mean offset is −0.08 dex for
the full sample, and −0.16 dex when the AGN are excluded.
These results stand in sharp contrast to the stellar velocity
dispersions of quiescent galaxies. Red squares are seven galaxies
with 2 < z < 2.5 and measured σ_stars, r_e, n, and M_stars from van
Dokkum et al. (2009), van de Sande et al. (2013), and Belli
et al. (2014b). They have a mean offset in σ_stars/σ_pred of +0.05
dex and an rms scatter of only 0.03 dex.

As dynamical mass is proportional to σ², the offsets of the
sCMGs are even more dramatic in Fig. 10b, which shows the
relation between dynamical mass and stellar mass. Here
dynamical mass was calculated using

$$M_{\rm dyn} = \frac{\beta(n)\,\sigma_{\rm obs}^{2}\,r_e}{G}, \tag{18}$$

as derived by Cappellari et al. (2006) and following studies
of quiescent galaxies at high redshift (e.g., van de Sande
et al. 2013). For sCMGs σ_obs = σ_gas and for quiescent galaxies
σ_obs = σ_stars. Note that, given our definition of σ_pred (see
Eq. 6), panels a and b of Fig. 10 are two different ways of
presenting the same information. The mean mass offset of
the sCMGs is −0.16 dex for the full sample, and −0.32 dex
for the sample with the AGN removed. That is, the dynamical
masses of the non-AGN galaxies are on average a factor
of two lower than the stellar masses. Several of the galaxies
have apparent dynamical masses that are a factor of > 10
lower than their stellar masses. Again, the quiescent galaxies
show a tight relation in Fig. 10b, with a mean offset of +0.1
dex.
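The dynamical-mass estimate and the doubling of the logarithmic offsets can be illustrated with a short sketch. It assumes the form M_dyn = β(n) σ² r_e / G; the polynomial for β(n) is the fit commonly quoted from Cappellari et al. (2006) and is not given in this section, so it should be treated as an assumption, as should the value of G in kpc (km/s)² per solar mass and all function names.

```python
import math

G_KPC = 4.30091e-6   # gravitational constant in kpc (km/s)^2 / Msun


def beta(n):
    """Sersic-index dependent virial coefficient (assumed polynomial fit)."""
    return 8.87 - 0.831 * n + 0.0241 * n * n


def dynamical_mass(sigma_kms, r_e_kpc, n):
    """Dynamical mass in solar masses from sigma (km/s), r_e (kpc), Sersic n."""
    return beta(n) * sigma_kms ** 2 * r_e_kpc / G_KPC


def mass_offset_dex(sigma_offset_dex):
    """A sigma offset in dex maps to twice that offset in mass (M ~ sigma^2)."""
    return 2.0 * sigma_offset_dex
```

For example, a galaxy with σ = 200 km s⁻¹, r_e = 1 kpc, and n = 4 has M_dyn ≈ 5.5 × 10^10 M⊙ in this sketch, and the −0.08 dex and −0.16 dex mean offsets in σ translate into the −0.16 dex and −0.32 dex mass offsets quoted above.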

We conclude that the gas dynamics of sCMGs are not similar
to the stellar dynamics of quiescent galaxies in the same
mass and redshift range. The stellar masses and sizes are not
useful indicators of the observed gas velocity dispersions; in
fact, the observed [N II]/Hα ratio is a better predictor of the
observed Hα linewidth of a galaxy than its compactness is.
There are many ways to increase the velocity dispersion of a
galaxy so it falls above the lines of equality in the two panels
of Fig. 10: the broad line region of an AGN, AGN-induced
winds, and supernova-driven winds can all lead to broad Hα
lines (e.g., Westmoquette et al. 2009; Le Tiran et al. 2011;
Förster Schreiber et al. 2014; Banerji et al. 2015). This is
likely the case for several galaxies in the sample: the four
galaxies with the largest dynamical masses are all X-ray AGN
with [N II]/Hα ratios in the range 0.8-2.2. However, it is
difficult to decrease the observed dispersion. Setting aside the
possibility that the stellar masses of some galaxies could be in
error by a factor of ~ 10, this is only possible if the detected
ionized gas in sCMGs is distributed very differently from the
stars in quiescent galaxies. As we show below, there is strong
evidence that this is indeed the case.

Figure 10. a) Comparison of observed and predicted velocity dispersions. The predicted dispersions are calculated from the stellar mass, the
half-light radius, and the Sersic index. Red squares are quiescent galaxies at 2 < z < 2.5 from van Dokkum et al. (2009), van de Sande et al.
(2013), and Belli et al. (2014b). Points with errorbars are the 25 sCMGs; orange centers indicate galaxies with X-ray AGN. b) Comparison
between dynamical mass and stellar mass. The galaxies show a very large range, and the dynamical masses often appear to be lower than the
stellar masses. The gas in sCMGs does not have the same distribution and/or kinematics as the stars in qCMGs.

5.2. Evidence for Rotating Gas Disks 

A possible interpretation of the large range of velocity
dispersions is that the dynamics are dominated by rotation, and
we are seeing disks under a large range of viewing angles.
In Fig. 11a we show the distribution of projected axis ratios
q = b/a in our sample, as determined from the H_160 data
(see van der Wel et al. 2014b). The axis ratios of the 25
galaxies are inconsistent with a uniform distribution, which
would be expected for thin, randomly oriented disks. We
find no galaxies with q < 0.4 and the distribution peaks at
q ≈ 0.75. The distribution is consistent with that observed
for qCMGs, shown by the red line in Fig. 11a: according to
the Kolmogorov-Smirnov test the probability that both samples
were drawn from the same distribution of axis ratios is
27 %. The distributions are also consistent with results for the
general population of massive galaxies at z ~ 2 (Chang et al.
2013; van der Wel et al. 2014a). We note that we do not
detect a significant wavelength dependence of the mean axis
ratio of the 25 sCMGs: we find ⟨q⟩ = 0.76 ± 0.03 in J_125 and
⟨q⟩ = 0.74 ± 0.03 in H_160.

Even though the stars are not in thin disks, the gas can be. If
the gas is in rotationally-supported disks that are aligned with
the stellar distribution, the measured velocity dispersions are
expected to show an anti-correlation with the observed axis
ratios of the galaxies. As shown in Fig. 12 we see precisely
this effect: there is an anti-correlation, with a correlation
coefficient of −0.38 and a significance of 94 %. This is strong
evidence that the gas is in disks and that the measured
dispersions are dominated by gravitational motions. This
anti-correlation is not consistent with M82-style galactic winds:
outflows that are perpendicular to the disk lead to the highest
observed velocities (and hence integrated velocity dispersions)
when the disk is viewed face-on.

Going back to Eq. [17] we now assume that ctism and dwind 
can be neglected, so that 




^gas 

asin“'(/) 


(19) 


To derive rotation velocities we need to determine the relation
between inclination and axis ratio in our sample. We constructed
a model with long, intermediate, and short axes A, B,
and C that reproduces the observed axis ratio distribution for
random viewing angles. The orange line in Fig. 11a shows the
predicted distribution of q for thick disks - or oblate spheroids
- with A/B = 1 and q0 = C/A uniformly distributed between
q0 = 0.40 and q0 = 0.75. This model is an excellent fit to the
observed distribution of q. It should be emphasized that this is


The correlation between σ/√M and q has slightly less scatter, and equal
significance.

It is well known that the axis ratio distribution by itself is insufficient to 
constrain all three axes A, B, and C (see, e.g., Franx et al. 1991). Although 
there is some evidence that the stellar distribution of compact galaxies is 

Figure 11. (a) Distribution of axis ratios among the 25 sCMGs at 2 <
z < 2.5. The distribution is not uniform, and is inconsistent with thin
disks under random viewing angles. The axis ratio distribution of
qCMGs in the same redshift range is shown in red. The orange line is
a model for randomly oriented oblate objects with intrinsic thickness
q0 = C/A = 0.4-0.75. (b) The relation between median inclination
and observed axis ratio in this model. Dotted lines indicate the ±1σ
spread. (c) Inclination correction as a function of observed axis ratio.


a model for the intrinsic shapes of the stellar distribution, not
for the gas distribution: the gas is likely in thinner disks, and
all we assume is that the gas disks of the galaxies are aligned
with their stellar distributions.

For galaxies with intrinsic thickness q0 the relation between
the inclination and the observed axis ratio is given by

    \cos^2(i) = \frac{q^2 - q_0^2}{1 - q_0^2}.    (20)

As q0 is not a constant in our model the relation between i
and q is not single-valued. The solid line in Fig. 11b shows
the median relation, and the broken lines indicate the 1σ scatter.
Figure 11c shows the inclination correction 1/sin(i) as a
function of q.
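
The model described above can be illustrated with a short Monte Carlo sketch. It assumes oblate spheroids with q0 drawn uniformly between 0.40 and 0.75 and random orientations (cos i uniform between 0 and 1), and uses Eq. 20 to map inclination to observed axis ratio. The function names, sample size, and bin edges are illustrative choices, not taken from the paper.

```python
import math
import random


def observed_axis_ratio(q0, cos_i):
    """Projected axis ratio of an oblate spheroid of thickness q0 (Eq. 20 solved for q)."""
    return math.sqrt(q0 * q0 + (1.0 - q0 * q0) * cos_i * cos_i)


def inclination_from_q(q, q0):
    """Inclination in radians from Eq. 20; meaningful for q0 <= q <= 1."""
    cos2_i = (q * q - q0 * q0) / (1.0 - q0 * q0)
    cos2_i = min(1.0, max(0.0, cos2_i))  # guard against rounding outside [0, 1]
    return math.acos(math.sqrt(cos2_i))


def simulate(n=100000, q0_min=0.40, q0_max=0.75, seed=1):
    """Return (q, i) pairs for randomly oriented oblate spheroids."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        q0 = rng.uniform(q0_min, q0_max)
        cos_i = rng.random()  # random viewing angles: cos(i) uniform in [0, 1)
        sample.append((observed_axis_ratio(q0, cos_i), math.acos(cos_i)))
    return sample


def median_inclination(sample, q_lo, q_hi):
    """Median inclination (radians) of objects with q_lo <= q < q_hi."""
    incl = sorted(i for q, i in sample if q_lo <= q < q_hi)
    return incl[len(incl) // 2] if incl else None


sample = simulate()
for q_lo in (0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
    i_med = median_inclination(sample, q_lo, q_lo + 0.1)
    print(f"q = {q_lo:.1f}-{q_lo + 0.1:.1f}: median i = {math.degrees(i_med):5.1f} deg, "
          f"1/sin(i) = {1.0 / math.sin(i_med):.2f}")
```

Because q0 varies from object to object, a given observed q maps to a range of inclinations, which is why the sketch reports a median per bin rather than a single value.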

The inclination-corrected rotation velocities are shown in
Fig. 12b. They are derived from the gas velocity dispersions
and the observed axis ratios of the galaxies using the average
relation in Fig. 11c and assuming α = 0.8 ± 0.2 (see Rix et al.
1997; Weiner et al. 2006). In Appendix C we show that this
value is a reasonable approximation for the geometries of both
the mass and the ionized gas that we derive in this paper. The

oblate or disk-like rather than triaxial (e.g., van der Wel et al. 2014a; Zolotov
et al. 2015), in our paper this is an assumption, not a result.

Although the gas disks likely have lower C/A than the stellar distribution,
they are probably not as thin as disks in the local Universe (see, e.g.,
Cresci et al. 2009).



Figure 12. (a) Observed relation between the Hα velocity dispersion
and the H160 axis ratio. Orange centers indicate X-ray AGN. There
is a significant (anti-)correlation between σ_gas and q, as expected if
there is a significant contribution from rotation to σ_gas and the Hα
disks are aligned with the stellar distribution. The grey line indicates
the expected trend for rotating disks (Fig. 11c). (b) Inferred rotation
velocity versus axis ratio. The rotation velocities were corrected for
inclination using the observed axis ratios (see text). The median
rotation velocity is 338 km s^-1 for the full sample and 271 km s^-1 when
AGN are excluded.

large uncertainty reflects the fact that the conversion of
dispersion to rotation velocity depends on the spatial distribution
of the gas, and the underlying velocity field (see Appendix
C). Data of much higher spatial resolution and S/N ratio are
needed to measure α directly for these extremely compact
galaxies. The uncertainty in α and 50 % of the (logarithmic)
inclination correction were added in quadrature to the

For completeness, we note the interesting possibility that the two peaks
in the spectra may not be Hα and [N II] but two narrow peaks in a
“double-horned” Hα profile that happen to have exactly the separation of Hα and
[N II] λ6584. This may happen when the Hα emission originates from a
narrow ring rather than a disk. In most cases that interpretation can readily be
ruled out, from the spatially-resolved line profile (see Sect. 6.2) or from the
detection of the [N II] λ6548 line, but in a few cases (e.g., AEGIS_4ni4) it

error budget. The median rotation velocity for the full sample
is ⟨V_rot⟩ = 339 km s^-1. Excluding the X-ray AGN we find
⟨V_rot⟩ = 271 km s^-1.

If it is assumed that r_e is not only the half-light radius in the
H160 band but also the half-light radius of the Hα emission,
we can define the dynamical mass as

    M_{\rm dyn} = \frac{2 r_e V_{\rm rot}^2}{G}.    (21)

This is not a true total mass but simply twice the enclosed
mass within the half-light radius. In Fig. 13 this dynamical
mass is compared to the stellar mass. Although the inclination
corrections have lessened the offsets of the most extreme
outliers, it is clear that orientation effects are not sufficient to
explain the relatively low velocities that have been measured
for a large fraction of the sample. The mean offset for the
whole sample is -0.19 dex, and the scatter is 0.55 dex. In the
next Section we show that variation in the spatial extent of the
ionized gas with respect to that of the stars is a likely source
of both the offset and scatter in Fig. 13.
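
The conversion from a galaxy-integrated dispersion to a rotation velocity (Eq. 19) and then to a dynamical mass (Eq. 21) can be written out as a minimal sketch. The input values below are illustrative and are not measurements from the paper; the function names are hypothetical, and G is expressed in units of kpc (km/s)^2 per solar mass so that radii in kpc and velocities in km/s give masses in solar masses.

```python
import math

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun


def rotation_velocity(sigma_gas, incl_rad, alpha=0.8):
    """Eq. 19: V_rot = sigma_gas / (alpha * sin(i)), in km/s."""
    return sigma_gas / (alpha * math.sin(incl_rad))


def dynamical_mass(v_rot, r_e):
    """Eq. 21: twice the mass enclosed within r_e (kpc), in Msun."""
    return 2.0 * r_e * v_rot ** 2 / G


# Illustrative inputs only (assumed values, not taken from the paper's tables).
sigma_gas = 200.0           # observed velocity dispersion, km/s
incl = math.radians(50.0)   # inclination inferred from the axis ratio
r_e = 1.0                   # stellar half-light radius, kpc

v_rot = rotation_velocity(sigma_gas, incl)
m_dyn = dynamical_mass(v_rot, r_e)
print(f"V_rot = {v_rot:.0f} km/s, log M_dyn = {math.log10(m_dyn):.2f}")
```

Since the mass scales as the square of the velocity, the ±0.2 uncertainty adopted for α alone corresponds to roughly 0.2 dex in the dynamical mass.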


So far we have assumed that the spatial extent of the gas
is similar to that of the stars, that is, r_gas ≈ r_stars = r_e, where
r_gas is the half-light radius of the measured Hα distribution.
There is no a priori reason why this should be the case; e.g.,
in the models of Zolotov et al. (2015) compact galaxies often
have rings of gas and young stars around their dense centers,
which originate from ongoing accretion from the halo.
Furthermore, as shown earlier ~ 90 % of the star formation
in sCMGs is obscured, and the extinction is expected to be
particularly high toward the central regions (e.g., Gilli et al.
2014; Nelson et al. 2014). The distribution of detected Hα
emission may therefore be less centrally concentrated than the
distribution of star formation.

The radius of the gas disks can be inferred from V_rot if we
assume that the observed velocity is the circular velocity of
the stellar body at the radius of the gas. The gas radius then
depends on V_rot, the stellar mass, and the structural parameters
of the galaxy:

    r_{\rm gas} = \frac{G}{V_{\rm rot}^2}\, f(r_{\rm gas})\, M_{\rm stars},    (22)



Figure 13. Relation between dynamical mass and stellar mass, with 
dynamical masses calculated from the inclination-corrected rotation 
velocities and the stellar half-light radii. Most galaxies fall below the 
line of equality. 


6. SPATIALLY-EXTENDED GAS DISKS
6.1. Inferred Sizes of Gas Disks 

In the previous Section we showed that many galaxies have
galaxy-integrated velocity dispersions that are much smaller
than expected from their stellar masses and sizes. As demonstrated
in Sect. 5.2 this is partly caused by the sin(i) reduction
of the velocity of rotating disks. However, even after correcting
the observed dispersions to inclination-corrected rotation
velocities the dynamical masses are typically lower than the
stellar masses, particularly for galaxies that do not host an
X-ray AGN.

is difficult to exclude this possibility without observing other emission lines. 


with V_rot the inclination-corrected rotation velocity and f(r_gas)
a function that depends on the mass distribution of the galaxies:

    f(r_{\rm gas}) = \frac{\int_0^{r_{\rm gas}} I(r)\, 2\pi r\, dr}{\int_0^{\infty} I(r)\, 2\pi r\, dr}.    (23)


Here I(r) is the best-fitting Sersic profile to the light distribution.
For r_gas = r_stars (= r_e), f(r_gas) = 0.5 and Eq. 22 is equivalent
to Eq. 21 with M_dyn = M_stars. These expressions ignore
the fact that the 2D radii are not identical to the 3D radii, assume
that the stellar mass distribution can be approximated by
the H160 luminosity distribution, and assume that the contributions
of gas and dark matter to the total mass can be neglected
on the scales that are probed by the Hα emission.
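
To make Eqs. 22 and 23 concrete, the following sketch solves for the large-radius root of Eq. 22 by bisection. It is a hedged illustration, not the procedure used in the paper: the enclosed light fraction of a Sersic profile is evaluated with a series expansion of the incomplete gamma function, b_n is taken from a standard asymptotic approximation (an assumption made here), and all function names and search limits are hypothetical.

```python
import math

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun


def sersic_b(n):
    """Approximate Sersic b_n from an asymptotic expansion (assumed adequate for n >~ 0.5)."""
    return 2.0 * n - 1.0 / 3.0 + 4.0 / (405.0 * n) + 46.0 / (25515.0 * n * n)


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma function P(a, x), summed from its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    k = 0
    while abs(term) > 1e-14 * abs(total) and k < 100000:
        k += 1
        term *= x / (a + k)
        total += term
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def light_fraction(r, r_e, n):
    """Eq. 23: fraction of the projected Sersic light inside radius r."""
    x = sersic_b(n) * (r / r_e) ** (1.0 / n)
    return reg_lower_gamma(2.0 * n, x)


def circular_velocity(r, m_stars, r_e, n):
    """Velocity implied by Eq. 22 at radius r (kpc): V^2 = G f(r) M_stars / r."""
    return math.sqrt(G * light_fraction(r, r_e, n) * m_stars / r)


def gas_radius(v_rot, m_stars, r_e, n, r_min=1e-3, r_max=100.0):
    """Large-radius solution of Eq. 22 (kpc), or None if no root lies in (r_peak, r_max)."""
    grid = [r_min * (r_max / r_min) ** (j / 400.0) for j in range(401)]
    r_peak = max(grid, key=lambda r: circular_velocity(r, m_stars, r_e, n))
    if v_rot >= circular_velocity(r_peak, m_stars, r_e, n):
        return None  # faster than the model allows at any radius
    if v_rot <= circular_velocity(r_max, m_stars, r_e, n):
        return None  # the root would lie beyond r_max
    lo, hi = r_peak, r_max  # the model velocity declines from lo to hi
    for _ in range(100):
        mid = math.sqrt(lo * hi)  # bisect in log radius
        if circular_velocity(mid, m_stars, r_e, n) > v_rot:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


r_gas = gas_radius(v_rot=250.0, m_stars=1.0e11, r_e=1.3, n=4.0)
print(f"inferred r_gas = {r_gas:.1f} kpc")
```

Starting the bisection at the peak of the model velocity curve selects the outer of the two roots; the inner root, where the curve is still rising, is deliberately ignored in this sketch.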

Solving Eq. 22 numerically, we find that the inferred gas
disk sizes range from ~ 0.2 kpc to > 10 kpc. This large
range is not surprising, as it is effectively an interpretation
of the large variation that is seen in Fig. 13. Figure 14
shows the relation between inferred r_gas and the stellar effective
radius. The gas radii are typically larger than the stellar
radii, particularly for the galaxies that do not have an X-ray
AGN (black points). The ratio between the gas radius and
the stellar radius is shown explicitly in the bottom panel of
Fig. 14. The mean ratio, calculated with the biweight statistic
(Beers et al. 1990), is log r_gas - log r_stars = 0.15 ± 0.10 for
the full sample. Excluding galaxies with an AGN, we find
log r_gas - log r_stars = 0.37 ± 0.14. That is, the gas disks are a
factor of ~ 2.3 more extended than the stellar distribution.
This is strictly a lower limit, as it is assumed that only stars
contribute to the stellar mass, the galaxies have a relatively
“light” Chabrier (2003) IMF, and there are no contributions
from non-gravitational motions to the measured velocity dispersions.


That is, the distribution of the ionized gas, with no extinction corrections
applied. Measuring the true “r_gas” requires molecular line measurements with
high spatial resolution.

We note that there are two solutions to Eq. 22, as the gas could in principle
also be located in the inner < 50 pc where the rotation curve is still rising
(see, e.g., Fig. 18). This is unlikely given that the galaxies have, by selection,
star forming SEDs with a spatial extent of ~ 1 kpc. Furthermore, as we
show later, the large radius solutions are corroborated by the measured spatial
extent of the Hα emission.

extends to larger radii than the stars in these galaxies. We
emphasize here that we do not attempt to measure rotation 
curves directly from these velocity gradients, as this can only 
be done reliably when the sizes of galaxies are similar to, or 
larger than, the spatial resolution of the data (see, e.g., Vogt et 
al. 1996; Miller et al. 2011; Newman et al. 2013). 

For the nine galaxies that were observed with MOSFIRE
we can measure the spatial extent of the Hα emission. As discussed
in Sect. 3.1 a bright star was included in all MOSFIRE
masks, and the profile of this star in the spatial direction can
be used to approximate the PSF. We extracted spatial profiles
of the combined Hα and [N II] emission for the nine galaxies
by averaging the data in the wavelength direction. Each
column was weighted by the inverse of the noise (which is
dominated by sky emission lines); we did not weight by the
signal as this would bias the profile towards the central regions.
The spatial profiles are shown in Fig. 15 (black points
with errorbars). Each panel also shows the profile of the star
that was observed in that particular mask (orange points); the
FWHM of this profile is also indicated.

The profiles were fit by a model to determine the half-light
radii of the ionized gas. The model has the form

    M(r) = \Sigma(r) * P(r),    (24)

with r the position along the slit, Σ(r) the model for the one-dimensional
surface brightness profile of Hα along the slit,
P(r) a Gaussian fit to the profile of the star, and * denoting
a convolution. The Gaussian fits to the stellar profiles are
shown by orange lines in Fig. 15. Parameterizing P(r) with
the sum of two Gaussians does not improve the fit to the stellar
profile or change the results. It is not possible to constrain
the functional form of the surface brightness profile with our
data. Instead, we assume that the Hα is in an exponential disk
(see Nelson et al. 2013):


Figure 14. Relation between inferred radius of the gas distribution 
and the stellar half-light radius. Orange points indicate galaxies with 
X-ray AGN. The gas radii were determined from the stellar masses 
and the inclination-corrected rotation velocities. There is a large 
scatter, reflecting the large scatter in Fig. 13. The ratio between the 
gas size and the stellar size is shown in the bottom panel. Non-AGN 
(black) and AGN (orange) are shown separately. The galaxies with 
AGN have, on average, compact inferred gas distributions. For the 
non-AGN (black histogram) the average spatial extent of the gas is
~ 2.3× larger than that of the stars.


6.2. Measured Sizes of Gas Disks 

We can test directly whether the sCMGs are embedded in
large gas disks by examining the observed spatial extent of
the emission lines. Even though the galaxies were selected
to be extremely compact, the inferred spatial extent of the
emission line gas is so large that it should (just) be detectable
in ground-based, seeing-limited observations. The 2D spectra
are shown in Fig. 15; they cover a rest-frame wavelength
range from 6551 Å to 6596 Å and a spatial extent along the slit
of ±1.″5. The five empty panels are the sCMGs from Barro
et al. (2014b).

Remarkably, about 1/3 of the galaxies show velocity gradients.
They are most prominent in UDS_33334, UDS_26012,
and UDS_16442, but also visible in GOODS-S_5981,
UDS_42571, and UDS_35673. The seeing ranged from 0.″6
to > 1.″0, and the stellar half-light radii of the galaxies are
typically 0.″1. Therefore, the fact that we spatially resolve the
Hα emission immediately demonstrates that the ionized gas

    \Sigma(r) = \Sigma(0) \exp\left(-1.678\, \frac{|r - r_{\rm cen}|}{r_{\rm gas}}\right).    (25)


Here Σ(0) is a normalization factor, r_cen is the center of the
profile, and r_gas is the half-light radius of the ionized gas.

We fitted this model to the data using the emcee code,
as described for the fits in the wavelength direction in Sect.
3.4.1. Again, the priors are top hats with bounds that do not
constrain the fits or the errorbars. Rather than r_gas itself we
fit log r_gas; the error distribution of r_gas is highly asymmetric,
which means that the peak of the distribution of samples
does not coincide with its 50th percentile. The distribution of
the log r_gas samples is symmetric. The resulting measured gas
radii, converted to kpc, are listed in the panels of Fig. 15. For
seven out of nine galaxies the value of r_gas is different from
zero with > 2σ significance.
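
The forward model of Eqs. 24 and 25 can be sketched as follows. This is an illustrative implementation under stated assumptions: the PSF is a single Gaussian characterized by its FWHM, the convolution is a direct sum on a regular grid, and the log-likelihood is a simple Gaussian one of the kind a sampler such as emcee could be given. The function names, the grid, and the parameter values are hypothetical, and no sampler is invoked here.

```python
import math


def exponential_profile(r, sigma0, r_cen, r_gas):
    """Eq. 25: exponential surface brightness along the slit."""
    return sigma0 * math.exp(-1.678 * abs(r - r_cen) / r_gas)


def gaussian_kernel(step, fwhm, half_width=4.0):
    """Normalized Gaussian PSF sampled with spacing `step`, out to half_width * sigma."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    m = int(math.ceil(half_width * sigma / step))
    weights = [math.exp(-0.5 * (j * step / sigma) ** 2) for j in range(-m, m + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]


def model_profile(positions, sigma0, r_cen, r_gas, fwhm):
    """Eq. 24: exponential profile convolved with the PSF, evaluated at each position."""
    step = positions[1] - positions[0]
    kernel = gaussian_kernel(step, fwhm)
    m = len(kernel) // 2
    out = []
    for r in positions:
        value = 0.0
        for j, w in enumerate(kernel):
            # evaluate the intrinsic profile at the offset position, so there are no edge effects
            value += w * exponential_profile(r - (j - m) * step, sigma0, r_cen, r_gas)
        out.append(value)
    return out


def log_likelihood(params, positions, data, errors, fwhm):
    """Gaussian log-likelihood in (sigma0, r_cen, log10 r_gas)."""
    sigma0, r_cen, log_r_gas = params
    model = model_profile(positions, sigma0, r_cen, 10.0 ** log_r_gas, fwhm)
    return -0.5 * sum(((d - m) / e) ** 2 for d, m, e in zip(data, model, errors))


# Illustrative grid in arcsec and assumed parameter values.
positions = [-3.0 + 0.1 * k for k in range(61)]
profile = model_profile(positions, sigma0=1.0, r_cen=0.0, r_gas=0.5, fwhm=0.7)
print(f"peak of convolved profile: {max(profile):.3f}")
```

Parameterizing the likelihood in log r_gas mirrors the choice described in the text, where the sampled distribution of log r_gas is symmetric while that of r_gas is not.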

A geometric correction needs to be applied to the measured 
values of rgas to account for the fact that the slit is typically not 
aligned with the major axis of the gas disk. This correction 
depends on the orientation of the slit and on the inclination of 
the gas disk: 

    r'_{\rm gas} = \left[\cos^2(i) + \cos^2({\rm PA}_{\rm slit} - {\rm PA}_{\rm gal})\,(1 - \cos^2(i))\right]^{-1/2} r_{\rm gas},    (26)

with i the inclination (as derived in Sect. 5.2), PA_slit the
position angle of the slitmask, and PA_gal the orientation of the
galaxy on the sky (as determined with GALFIT). Note that
the corrected r'_gas is measured along the major axis (and is not
a circularized radius), consistent with our interpretation that

Figure 15. Two-dimensional MOSFIRE and NIRSPEC spectra centered on the redshifted Hα and [N II] lines. The galaxies are ordered by their
observed galaxy-integrated velocity dispersion, as in Figs. 5, 6, and 7. The inclination-corrected rotation velocity V_rot (in km s^-1) is indicated
in each panel. At least 1/3 of the galaxies show velocity gradients, demonstrating that their ionized gas distributions are spatially resolved in
these ground-based, seeing-limited data. For the nine galaxies observed with MOSFIRE the spatial extent of the gas can be measured, using
the profile of a star (orange). Black curves are the best fitting exponential profiles convolved with the PSF. The measured half-light radii of the
Hα emission (r_gas, in kpc) are indicated.


the gas is in thin, rotating disks. The median correction is
small at 9 %. For GOODS-S_30274 we use the median correction
of the other eight galaxies, as its PA mostly reflects the
orientation of its tidal tail. We use the corrected radii when
comparing the measured radii to predicted radii and when deriving
the rotation curve of the galaxies in Sect. 6.4.
For three galaxies, UDS_35673, GOODS-S_30274, and 
GOODS-N_6215, we obtained an independent measurement 


of the extent of the emission line gas from their WFC3/G141
grism spectra. These are the only galaxies in the sample
of 25 that have grism spectra covering the redshifted [O III]
λλ4959,5007 lines and a detection of these lines with > 5σ
significance. As shown in Nelson et al. (2012) emission
lines in grism spectra are images of the galaxy in the light
of that line, providing direct information on the distribution
of ionized gas at 0.″14 resolution. The interpretation of the
[O III] lines is complicated by the fact that the two lines are
very close together on the detector. We fit the lines simultaneously
with GALFIT (Peng et al. 2002), keeping their
relative location and flux ratio fixed and using a PSF generated
with Tiny Tim (Krist 1995). Two of the three galaxies
(UDS_35673 and GOODS-S_30274) are also in the MOSFIRE
sample. The best-fit G141 [O III] radii of these objects
are 1.6 ± 0.3 kpc and 5.1 ± 1.5 kpc, in excellent agreement
with the MOSFIRE Hα values (1.3 kpc and 3.9 kpc, respectively).
The third galaxy, GOODS-N_6215, has a G141
[O III] radius of 3.0 ± 1.0 kpc. In the following, we show all
twelve measurements in figures (nine from MOSFIRE, three
from HST), with lines connecting the two independent measurements
for UDS_35673 and GOODS-S_30274.

6.3. Comparison of Inferred and Measured Sizes 

For the ten galaxies with gas size measurements we can
directly compare the inferred sizes to the measured ones. The
results are shown in Fig. 16. There is a clear correlation, with
a significance of > 99 %. Furthermore, the offset between the
two sets of radii is small. Giving equal weight to all twelve
measurements we find a difference of only -0.09 ± 0.07 dex.
This excellent agreement between inferred and measured radii
provides support to our modeling of the observed kinematics
of sCMGs.



Figure 16. Relation between inferred and measured half-light radii of
the gas distribution in sCMGs. Orange points are galaxies with an
X-ray AGN. Circles are Keck/MOSFIRE measurements of Hα; squares
are HST/WFC3 measurements of [O III]. Points connected by dotted
lines are measurements for the same galaxy. The measured sizes
were corrected to account for the difference in orientation between
the slit and the galaxy’s major axis. The inferred sizes are based on
the observed velocity dispersions, axis ratios, and stellar masses of
the galaxies, and the measured sizes are determined directly from
the spatial extent of the emission lines. There is a strong correlation,
with no significant offset.


ten galaxies with measured Hα effective radii. The dynamical
masses in the right panel were calculated using

    M_{\rm dyn,gas} = \frac{V_{\rm rot}^2\, r'_{\rm gas}}{f(r_{\rm gas})\, G},    (27)

with f(r_gas) accounting for the (small) fraction of the mass
that is outside r_gas (see Sect. 6.1). The dynamical masses in
the right panel are consistent with the stellar masses for all
galaxies, although we note that the sample is small. The mean
offset is log M_dyn,gas - log M_stars = -0.07 ± 0.08, and the rms
scatter is 0.25 dex.

Summarizing the results from this and the previous Section,
we have inferred that sCMGs have rotating gas disks whose
observed spatial extent is larger by a factor of ~ 2 than their
stellar distribution. This is based on four related results: 1)
Many of the galaxies have very low galaxy-integrated velocity
dispersions; this shows that the gas does not have the same
spatial distribution as the stars and that galactic-scale winds
do not dominate the kinematics for the majority of the sample
(Fig. 10a). 2) The observed dispersions display a significant
anti-correlation with the axis ratios of the galaxies; this
is consistent with disks under a range of viewing angles and
difficult to reconcile with M82-style galactic winds (Fig. 12a).
3) Nearly all galaxies with spatially-resolved gas distributions
show velocity gradients (Fig. 15). 4) Inferring the sizes of
the gas disks from the inclination-corrected rotation velocities,
we find good agreement between the inferred sizes and
the measured sizes (Fig. 16).


6.4. Keplerian Rotation out to 7 kpc 

The measured kinematics can be used to constrain the total
mass within ~ 7 kpc. We can derive a spatially-resolved rotation
curve by making use of the fact that the measured spatial
extent of the gas varies by a factor of 10 (see Fig. 16), under
the assumption that the galaxies have similar inclination-corrected
dynamics after scaling them to a common mass.
The validity of this approach is demonstrated in Appendix
C, where we calculate the relation between the observed
galaxy-integrated linewidths and the actual rotation velocity
at r = r_gas. To bring all galaxies to the same normalization, we
first define the scaled rotation velocity as

    V_{\rm rot,scaled} = \frac{V_{\rm rot}}{\sqrt{M_{\rm stars} / \langle M_{\rm stars} \rangle}},    (28)

with ⟨M_stars⟩ = 1.0 × 10^11 M_⊙ the median stellar mass of the
full sample of 25 sCMGs. We note that this scaling does not
change the velocities by a large amount as the galaxies in our
sample span a small mass range.

In Fig. 18 the scaled velocities are plotted as a function
of the measured gas half-light radius r_gas (corrected for slit
orientation) for the 10 galaxies that have this measurement.
The rotation curve declines: in galaxies where Hα is measured
at a larger distance from the center, the inclination-corrected
rotation velocity is lower. The decline has a formal
significance > 99 %. Falling rotation curves have been
seen previously in some individual (large, non-compact) z ~ 2
galaxies (e.g., the galaxies D3a6397 and zC400690 in Genzel
et al. 2014b). The solid line is the predicted rotation


This result is presented in a different way in Fig. 17, which
shows the relation between dynamical mass and stellar mass.
The left panel is identical to Fig. 13, but here we only show the


There are indications that the presence of velocity gradients
anti-correlates with the axis ratio, as expected in the rotating disk interpretation,
but larger samples with higher spatial resolution are needed to confirm this.

Figure 17. Dynamical mass versus stellar mass when using the stellar half-light radii (left panel) or the Hα half-light radii (right panel) to
calculate the dynamical mass. The left panel shows the same information as Fig. 13 but only for the ten galaxies with measured Hα radii. The
dynamical masses derived from the gas radii are self-consistent, as the rotation velocities were measured at r_gas, not r_stars.



Figure 18. “Rotation curve” for sCMGs at 2.0 < z < 2.5. Points with
errorbars are measured inclination-corrected rotation velocities and
measured gas effective radii of ten different galaxies. The quantities
on the two axes are therefore independent. The velocities were corrected
to a common mass of 10^11 M_⊙ and the radii were multiplied
by a factor that accounts for the slit alignment. Galaxies with orange
centers have an X-ray AGN. The rotation curve declines, with
> 99 % significance. The black curve is not a fit, but the expected
rotation curve if all the mass is in the compact stellar component of
the galaxies. This model is a good description of the data. The grey
curve assumes that 50 % of the total mass is in the form of gas, with
a spatial extent that is a factor of 2.5 larger than that of the stars. This
model is inconsistent with the data.

curve for an M = 10^11 M_⊙ galaxy with the median effective
radius (r_e = 1.3 kpc) and median Sersic index (n = 4) of the
sCMGs, calculated with Eq. 22. This model is a good description
of the data; χ² = 6.5 with 12 degrees of freedom.
The grey line is a model with two mass components: in addition
to the stellar component this model has a gas component
with the same mass as the stars (i.e., the gas fraction in this
model is f_gas = M_gas/(M_stars + M_gas) = 0.5). For consistency
with the previous Sections, the spatial distribution of the gas
is assumed to be exponential with r_gas = 2.5 × r_e. The grey
line overpredicts the observed velocities; with χ² = 30.0 this
model can be ruled out with 99 % confidence.
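
The two χ² values can be turned into tail probabilities with a few lines of code. The sketch below uses the closed-form survival function of the χ² distribution for an even number of degrees of freedom; whether the quoted confidence level was derived in exactly this way is an assumption made here, and the function name is hypothetical.

```python
import math


def chi2_sf_even(chi2, dof):
    """Survival function P(X > chi2) of the chi-square distribution for even dof."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form requires a positive, even number of degrees of freedom")
    half = chi2 / 2.0
    term, total = 1.0, 1.0
    for k in range(1, dof // 2):
        term *= half / k  # builds (chi2/2)^k / k!
        total += term
    return math.exp(-half) * total


for label, chi2 in (("stars only", 6.5), ("stars + extended gas", 30.0)):
    print(f"{label}: chi2 = {chi2}, P(>chi2) = {chi2_sf_even(chi2, 12):.4f}")
```

A tail probability close to unity for the stars-only model and well below 0.01 for the model with an equal mass in gas is in line with the statements in the text.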

We can derive an upper limit to the gas mass within 7 kpc
by assuming that the uncertainty in the stellar mass is small
and allowing the mass in the gas component to vary. The
95 % confidence upper limit to the gas mass is M_gas < 0.6 ×
10^11 M_⊙, corresponding to a limit on the gas fraction of f_gas <
0.4. It appears that the gas is mostly a tracer of, rather than
a contributor to, the kinematics. Finally, we derive the best
fitting mass within r = 7 kpc by assuming that f_gas = 0 and
allowing M_stars to vary: M_tot = O.S^q® × 10^11 M_⊙, where the
errorbars are 95 % confidence limits. Although this estimate
assumes that mass follows light, we verified that the results
are very similar for more extended mass distributions. We
conclude that the dynamical mass within r ~ 7 kpc is fully
consistent with the stellar mass that is implied by the stellar
population fit, and that there is little room for additional stars,
gas, or dark matter inside this radius.

7. ARE STAR FORMING COMPACT GALAXIES THE
MAIN PROGENITORS OF QUIESCENT COMPACT
GALAXIES?

In the previous Sections we have shown that a population
of star forming galaxies exists at z > 2 whose dynamical mass
within ~ 7 kpc is dominated by a massive, compact, stellar
component. We now ask whether these galaxies can be
progenitors of the population of massive, compact, quiescent
galaxies, by considering their number densities, morphologies,
and star formation rates. This question has been discussed
before, by, e.g., Williams et al. (2014, 2015), Bruce
et al. (2014), Nelson et al. (2014), Dekel & Burkert (2014),
and Zolotov et al. (2015). Arguably the most extensive observational
study is a series of papers by Barro et al. (Barro et al.
2013, 2014a, 2014b), using data over two (Barro et al. 2013,
2014b) or one (Barro et al. 2014a) of the five fields that we
study here. Using our larger data set and more restrictive selection
we find broadly similar results.

7.1. Number Density Evolution 

A star forming compact massive galaxy will resemble a
quiescent compact massive galaxy if star formation stops
(quenching). However, the opposite is also true: a quiescent
compact galaxy that starts forming stars due to the accretion
of new gas (see, e.g., Zolotov et al. 2015; Graham, Dullo,
& Savorgnan 2015) could resemble a star forming compact
galaxy (rejuvenation). We can determine whether quenching
or rejuvenation dominates by measuring the number density
of sCMGs and qCMGs as a function of redshift. The selection
criteria of Sect. 2.3 were applied in small redshift bins,
and the number density was determined by dividing the number
of galaxies in the bin by its volume. The result is shown
in Fig. 19 (filled points and solid curves).

Figure 19. Evolution of the number density of sCMGs (blue solid
line) and of qCMGs (red solid line). The number density of all star
forming and quiescent galaxies with log(M_stars) > 10.6 is also shown
(dashed lines). The data suggest that compact star forming galaxies
continuously enter the selection region from z > 2.8 to z ~ 1.8
and quench, leading to a strong increase in the number of compact
quiescent galaxies. When the number of sCMGs begins to decrease
at z < 1.8, the number of qCMGs first plateaus and then drops, as
quiescent galaxies grow in size due to mergers at 0.5 < z < 1.5.

At 2.0 < z < 2.5 the number densities of the two populations
are very similar, as already noted in Sect. 2.4. However,
at higher and lower redshifts the number densities are
different: the sCMGs have a roughly constant number density
from z ~ 2.8 to z ~ 1.8, whereas the number density of
qCMGs increases by an order of magnitude over that same
redshift range. A straightforward interpretation is that star
forming galaxies continuously enter the “compact massive”
The evolution of compact quiescent galaxies may become more gradual 

selection box (because of a decrease in their size and/or an
increase in their mass), and quench shortly after. This continuous
quenching then leads to a rapid build-up of the number
of quiescent galaxies in the compact/massive selection region.
We conclude that quenching likely dominates over rejuvenation:
if rejuvenation dominated, we would have expected to
see quiescent galaxies disappear as their star formation
(re-)started, unless there are other channels to create quiescent
compact galaxies. We note that the evolution of the number
densities of the two populations is qualitatively similar to the
simulations of Zolotov et al. (2015).

It is difficult to determine how long it takes before a compact
star forming galaxy turns into a quiescent galaxy, as this
depends on the rate with which new galaxies enter the sample.
The number density of sCMGs is constant from z ~ 2.8
to z ~ 1.8, which means that new sCMGs enter the sample
at approximately the same rate as existing ones quench. We
can obtain a very rough estimate of the “compact life time”
of star forming galaxies τ_c by adding the number densities of
the sCMGs in the three redshift bins that cover this period: if
the average quenching time is much shorter than the time interval
between redshift bins, all galaxies in each bin are new
arrivals and should be added to the sample of progenitors of
quiescent galaxies. The combined number density in these
bins (which are of nearly equal volume) is 2.0 × 10^-4 Mpc^-3,
higher than the increase in the number density of the qCMGs
over this period (1.3 × 10^-4 Mpc^-3). This implies that only
about half of the star forming galaxies disappear from one redshift
bin to the next, and that the average quenching timescale
is roughly equal to the time interval between the redshift bins:
τ_c ~ 0.5 Gyr. This is the average lifetime of star forming
galaxies in the “compact massive” selection box, assuming
that they all turn into quiescent galaxies. It is slightly lower
than the value of ~ 0.8 Gyr derived by Barro et al. (2013), but
judging from their Fig. 5 the two studies are broadly consistent.
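
The arithmetic behind this rough estimate can be sketched as follows. The cosmological parameters (a flat ΛCDM model with H0 = 70 km/s/Mpc and Ω_m = 0.3) and the split of the z = 2.8 to 1.8 interval into three equal time steps are assumptions made for this illustration and are not stated in this section; only the ratio of the two number densities quoted in the text enters the calculation.

```python
import math


def cosmic_age_gyr(z, h0=70.0, omega_m=0.3):
    """Age of a flat LambdaCDM universe at redshift z, in Gyr (analytic solution)."""
    omega_l = 1.0 - omega_m
    hubble_time = 977.8 / h0  # 1/H0 in Gyr for H0 in km/s/Mpc
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return 2.0 / (3.0 * math.sqrt(omega_l)) * hubble_time * math.asinh(x)


z_hi, z_lo, n_bins = 2.8, 1.8, 3
interval = (cosmic_age_gyr(z_lo) - cosmic_age_gyr(z_hi)) / n_bins

summed_scmg = 2.0  # summed sCMG number density over the three bins (units as in the text)
growth_qcmg = 1.3  # increase in the qCMG number density over the same period
quenched_fraction = growth_qcmg / summed_scmg

print(f"time per redshift bin: {interval:.2f} Gyr")
print(f"fraction of sCMGs that quench between bins: {quenched_fraction:.2f}")
```

Under these assumptions the time step per bin is a few tenths of a Gyr and the quenched fraction is somewhat above one half, so the implied lifetime is of the same order as the bin spacing.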

Although somewhat outside of the scope of this paper, we
briefly discuss the number density evolution at lower redshift.
The number density of sCMGs drops precipitously after
z ~ 1.8. This drop leads to a plateau in the number density
of qCMGs: as the number of star forming progenitors decreases,
no new quiescent galaxies are added to the sample.
At the lowest redshifts the number density of compact quiescent
galaxies decreases (as was also found by Taylor et al.
2010b, van der Wel et al. 2014b, and van Dokkum et al. 2014,
among others), while the number density of all massive quiescent
galaxies rises steeply (dashed red curve). The likely
explanation is that the compact galaxies accrete extended envelopes
through merging from z ~ 1.5 to the present day (e.g.,
Bezanson et al. 2009; Naab et al. 2009; van Dokkum et al.
2010, 2014; Newman et al. 2012; Hilz et al. 2013).

Finally, we note that Fig. 19 is not new: the peak in the
number density of compact, massive quiescent galaxies was
also shown in Cassata et al. (2011, 2013), Barro et al. (2013),
and van der Wel et al. (2014b). Barro et al. (2013) derive a
similar lifetime for star forming galaxies in the compact selection
region. Although uncertainties remain (particularly at
low redshift; see, e.g., Carollo et al. 2013), it is encouraging
that these largely independent samples give similar results.


at z > 3: Straatman et al. (2015) recently reported the existence of a
sizeable population of compact, massive quiescent galaxies at z ~ 4, based on
medium-band near-IR photometry.
7.2. Morphologies and Radial Surface Brightness Profile

The large spatial extent of the ionized gas raises the
question whether the stellar half-light radii and masses of the
compact star forming galaxies have been underestimated:
although it is difficult to bias GALFIT measurements in this
direction (see, e.g., Davari et al. 2014), it is possible that
the galaxies have extended low surface brightness envelopes
(see, e.g., Hopkins et al. 2009a). If such envelopes exist,
this would call into question whether sCMGs can be direct
progenitors of compact quiescent galaxies with the same
apparent mass and half-light radius.

Images of the galaxies are shown in Fig. 6 and in Fig. A1
(see Sect. 3.3). Visually, most of the objects have a compact
luminosity distribution and no spiral arms, clumps, star
forming complexes, or other features outside of the dense center.
Several of the reddest galaxies do not appear very compact:
for example, UDS_42571 and, in particular, UDS_16442 are
faint and fuzzy rather than bright and compact. The reason
for their relatively low surface brightness is that dust
obscuration has dramatically lowered their luminosity: as galaxies
can have high M/L ratios, compact in mass does not
necessarily imply compact in light.

Two objects show unambiguous evidence for ongoing or
past mergers: GOODS-S_30274 has an asymmetric feature
resembling a tidal tail, and COSMOS_11363 is one
component of a spectacular merger between two compact galaxies
with a projected separation of 0.6" (5 kpc). The companion
of COSMOS_11363 is COSMOS_11337 in the Skelton et al.
(2014) catalog. Our Keck/NIRSPEC and HST/WFC3
spectroscopy confirms that they are at the same redshift. With
re = 1.0 kpc and Mstars = 1.7 × 10^11 M⊙, COSMOS_11337 is
actually significantly more compact than COSMOS_11363.
Its rest-frame UVJ colors (just) give it a quiescent
classification. This merging pair seems to suggest that CMGs can
form in mergers (Hopkins et al. 2009b), but that is not the
right interpretation: as both galaxies already fall in the
“compact massive” selection region, this particular type of merger
actually decreases their number. Even if the result of the
merger falls in the selection region, there will be one less
CMG. Interestingly, several other galaxies show evidence for
distorted outer isophotes in Fig. A1. This could indicate that
interactions are common for these galaxies, but the evidence is
not conclusive at the depth of the CANDELS imaging.

To quantify the stellar emission on scales ≫ 1 kpc we
stacked the H160 images of the 25 sCMGs and measured their
averaged radial surface brightness profile to faint levels. Each
galaxy was normalized by its total flux prior to
stacking, so that the stack is not dominated by a few bright
objects. Neighboring objects, identified from the SExtractor
segmentation map (see Bertin & Arnouts 1996; Skelton et al.
2014), were masked. The resulting surface brightness
profile is shown in the top panel of Fig. 20 (blue points). We
fit the stack with a PSF-convolved Sersic profile to determine
whether there is evidence for an additional component at large
radii. This fit, done with GALFIT (Peng et al. 2002), is
shown by the blue line. It is an excellent description of the
data out to 15 kpc (> 10 re): there is no excess light beyond
a single Sersic profile. Furthermore, the best-fitting effective
radius (re = 1.3 kpc) and Sersic index (n = 3.6) are similar to
the median values of the 25 galaxies that went into the stack:

We note that the rest-frame J magnitudes of these objects are somewhat 
uncertain as they rely on accurate deblending of the IRAC fluxes; it may well 
be that both galaxies are sCMGs. 



Figure 20. Radial surface brightness profile, measured from a stack
of all 25 sCMGs in our spectroscopic sample (blue points). The
profile is very well fit by a single Sersic profile, convolved with the PSF
(blue line). There is no excess emission at large radii. For
comparison, the red points and red line are for qCMGs that were selected
to have the same median size and mass as the sCMGs. Their profile
is virtually identical to that of the star forming galaxies. The bottom panel
shows color profiles for both samples. The galaxies have modest
color gradients, with the outskirts slightly bluer than the centers.

⟨re⟩ = 1.4 kpc and ⟨n⟩ = 4.3.

The stacked sCMG profile is compared to a stacked qCMG
profile, shown in red in Fig. 20. The qCMGs in this Figure
are a subset of the full population: they were selected in
narrow bins of mass and effective radius, centered on the median
values of the 25 sCMGs. This ensures that any differences
between the stacks are not caused by a difference in the mean
size or mass of the samples. The quiescent profile is
virtually indistinguishable from that of the star forming galaxies.
Finally, J125 − H160 color profiles of both stacks are shown in
the bottom panel of Fig. 20. Both stacks are bluer at larger
radii and the gradients are small, qualitatively consistent with
previous work (Szomoru, Franx, & van Dokkum 2012). The
negative color gradients imply that the galaxies are even more
compact in mass than in light, and that any stellar emission at
r ≫ re is not missed because it is enshrouded in dust.

We conclude that the morphologies of the sCMGs are
consistent with being direct progenitors of qCMGs. When
selected to have the same mass and effective radius, their
surface brightness profiles are indistinguishable out to at least
15 kpc. We find a relatively high Sersic index for both
populations. Such high values (and the relatively round 3D
morphologies; see Sect. 5.2) are consistent with violent relaxation
following a merger, but also with composite structures, such
as envelopes of material around extremely compact
exponential disks.

7.3. Star Formation Rates and Gas Content 

Accepting that the sCMGs are direct progenitors of
qCMGs, an important question is whether they are forming
a large fraction of the stars that are present in their quiescent
descendants. If the lifetimes of the sCMGs are short, or the
star formation rates are low, they may account for only a small
fraction of the total stellar mass in compact massive galaxies
at z ~ 2. We address this question in Fig. 21, which shows
the relation between the specific star formation rate and
compactness within the sample of compact, massive galaxies at
2 < z < 2.5.



Figure 21. Relation between specific star formation rate and
compactness (∝ Mstars/re), for galaxies in the “massive, compact”
selection box at 2 < z < 2.5. Red points are UVJ quiescent galaxies; blue
points are UVJ star forming galaxies. Within the sample of massive
compact galaxies, the specific star formation rate, and the fraction of
UVJ star forming galaxies, declines with the degree of compactness.
The right axis is the fraction of mass that will be added to the
galaxies in 0.5 Gyr, which is the estimated average lifetime of star forming
galaxies in the massive, compact region. About 1/3 of the mass of
compact quiescent galaxies was formed in the compact phase.

The right axis of this figure shows the fraction of the total
stellar mass that is formed in the compact phase:

$$ \frac{M_c}{M_{\rm stars}} = {\rm SSFR} \times w \times T_c, \qquad (29) $$

with SSFR the specific star formation rate, w a correction for
mass loss due to stellar winds, and Tc the average lifetime
of star forming galaxies in the compact, massive selection
region. The median specific star formation rate of the sCMGs
is SSFR = 1.2 × 10^-9 yr^-1, and for w = 0.6 (Chabrier 2003)
and Tc ~ 0.5 Gyr (Sect. 7.1) we find Mc ~ 0.4 Mstars. As they
are, on average, observed halfway through their lifetime in
the compact selection region, their final mass before
quenching will be Mstars,final = Mstars + 0.5 Mc ~ 1.2 Mstars, and the
fraction of Mstars,final that is formed in the compact phase is then
~ 1/3. We conclude that sCMGs are responsible for forming
a significant fraction of the stars that are present in compact
quiescent galaxies.
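
The arithmetic behind this estimate can be sketched in a few lines of Python. This is only an illustration of Eq. 29 and the "observed halfway" argument, not the authors' code; the function name is hypothetical, and the inputs (SSFR = 1.2 × 10^-9 yr^-1, w = 0.6, Tc = 0.5 Gyr) are the values as read from the text above.

```python
def compact_phase_fraction(ssfr_per_yr, w, lifetime_yr):
    """Return (M_c / M_stars, M_final / M_stars, M_c / M_final).

    M_c is the stellar mass formed during the compact phase (Eq. 29).
    """
    mc_over_m = ssfr_per_yr * w * lifetime_yr
    # Galaxies are observed, on average, halfway through the compact
    # phase, so half of M_c still has to be added to the observed mass.
    final_over_m = 1.0 + 0.5 * mc_over_m
    return mc_over_m, final_over_m, mc_over_m / final_over_m


if __name__ == "__main__":
    mc, final, frac = compact_phase_fraction(1.2e-9, 0.6, 0.5e9)
    print(f"M_c/M_stars = {mc:.2f}, M_final/M_stars = {final:.2f}, "
          f"fraction formed in compact phase = {frac:.2f}")
```

With these inputs the unrounded numbers come out slightly below the rounded values quoted in the text (0.4, 1.2, and 1/3), which is expected given the approximate nature of the estimate.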

An implication of this result is that the spatial distribution
of the Hα emission in sCMGs is probably more extended
than the spatial distribution of star formation in these
galaxies. This is qualitatively similar to results for galaxies at z ~ 1
(Nelson et al. 2012, 2015), and may indicate that star
formation has ceased in the inner regions of the galaxies (e.g.,
Genzel et al. 2014b; Tacchella et al. 2015). However, as discussed
in Sect. 4.2, most of the star formation in sCMGs is obscured,
and the observed Hα emission accounts for only ~ 10 % of the
total star formation. As the column density is a very strong
function of radius in these compact galaxies (see, e.g., Gilli
et al. 2014; Nelson et al. 2014), the obscuration-corrected
distribution of star formation is almost certainly much more
compact than the observed distribution of Hα emission - at
least for the galaxies with low observed velocity dispersions.

A somewhat puzzling aspect of the sCMGs is that they have
very high specific star formation rates even though their
observed kinematics leave little room for a large gas reservoir
(see Sect. 6). Many studies have found that the molecular
gas and dust content of galaxies increases with redshift, and
reaches > 50 % of the total baryonic mass for z ~ 2 galaxies
with the highest star formation rates (e.g., Tacconi et al. 2010;
Daddi et al. 2010; Genzel et al. 2015; Scoville et al. 2015).
Using the scaling relations derived in Genzel et al. (2015), the
expected gas fraction for the galaxies in our sample is ~ 60 %.
One possible explanation for their relatively low gas fraction
is that the galaxies have nearly exhausted their reservoir and
are about to quench. If the galaxies typically build ~ 40 % of
their mass inside the compact, massive selection region, the
average sCMG should have ~ 30 % of its mass in gas (for
w ~ 0.6); this is just consistent with the 95 % confidence
upper limit on the gas fraction of 40 % that we derived in Sect. 6.

Another explanation is that newly accreted gas is
continuously and efficiently funneled into the central regions, and
the star formation rates are “accretion throttled” (Dekel et al.
2009); in that case the gas depletion time can be shorter than
the actual duration of star formation (see, e.g., Genzel et al.
2010). Direct observations of the dust and molecular gas in
sCMGs, at ~ 1 kpc resolution, are needed to address these
questions.

Finally, we note that star forming galaxies tend to be less
compact than quiescent galaxies even within the population
of compact massive galaxies at 2 < z < 2.5 (see Fig. 21). As
discussed earlier in the context of the sample selection (Sect.
2.4), star forming galaxies are always less compact than
quiescent galaxies, irrespective of the precise criteria for their
selection. In the next Section we interpret the distribution
of galaxies in the size-mass plane in the context of a simple
model, in which star forming galaxies become gradually more
compact and the probability of quenching rises smoothly as
their compactness increases.

8. FORMATION OF STAR FORMING COMPACT 
GALAXIES 
8.1. A Simple Model for Building Massive Galaxies

In this Section we turn to the formation of compact,
massive star forming galaxies. Several distinct mechanisms have
been discussed in the literature, including mergers of gas-rich
galaxies (Tacconi et al. 2008; Hopkins et al. 2009b; Hammer
et al. 2009; Wellons et al. 2015), in-situ, inside-out growth of
even more compact progenitors (Oser et al. 2010; Johansson,
Naab, & Ostriker 2012; Williams et al. 2014; Nelson et al.
2014; Wellons et al. 2015), “compaction” of the gas in star
forming galaxies due to disk instabilities (Dekel & Burkert
2014), and hybrid models that include several of these effects
(Zolotov et al. 2015).

Although individual massive galaxies likely have complex
formation histories, including periods of compaction,
mergers, and star bursts, the population of massive galaxies should
follow a particular track in the size-mass plane that is
determined by the dominant mode of growth when the evolution
of many galaxies is averaged. Tracks derived from
observations and simulations are shown in Fig. 22. The blue and red
tracks show the evolution of galaxies matched by their
cumulative number density, for (relatively) low mass galaxies (van
Dokkum et al. 2013, blue) and high mass galaxies (Patel et al.
2013, red). The solid parts of the curves are for 1.5 < z < 3
and the dotted parts for 0 < z < 1.5. Low mass galaxies evolve
along a single track with a slope of ~ 0.3. High mass galaxies
evolve along a similar track from z ~ 3 to z ~ 1.5 but then
turn “upward”, around the time when star formation ceases
and the growth becomes dominated by dry mergers (see Sect.
9.1).

Magenta, orange, and black curves are from simulations.
The magenta tracks are the wind models shown in Fig. 10
of Hirschmann et al. (2013), for two different mass ranges.
These models are the same as those in Genel et al. (2012), and
are updated versions of the momentum-driven wind models
of Oppenheimer & Davé (2006) in cosmological simulations.
They include both winds and metal enrichment; as shown in
Hirschmann et al. (2013) models without winds predict
somewhat steeper relations between size growth and mass growth.
The orange curve is the track of galaxies in the Illustris project
(Vogelsberger et al. 2014), as shown in Fig. 5 of Wellons et al.
(2015). This is the average track of all galaxies with a stellar
mass in the range 1-3 × 10^11 M⊙ at z = 2. The thin black
curves show the evolution from z = 3 to z = 1.5 of
individual galaxies in the simulations of Zolotov et al. (2015). We
include all 34 simulations, irrespective of whether they have
a “compaction” phase. The thick dashed curve was created
by averaging the evolution in these simulations. The
number density-matched observational samples and the
simulations all suggest that the ensemble-averaged evolution of star
forming galaxies in the size-mass plane is well approximated
by

$$ \Delta \log r_e = 0.3\, \Delta \log M_{\rm stars}, \qquad (30) $$

that is, galaxies increase their size by a factor of 2 for every
factor of 10 evolution in their mass. This simple inside-out
growth model is qualitatively consistent with a host of other
data and theory, including the expected growth of disks in
ΛCDM (e.g., Mo, Mao, & White 1998) and the distributions
of star formation and existing stars in galaxies (e.g., Nelson
et al. 2012). Interestingly, this track corresponds to an
approximately constant 3D density within the effective radius
(as ρ(re) ∝ M/re^3, it follows that re ∝ M^(1/3) if the density is
constant).
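
As a quick numerical check of these two statements, the following Python sketch evaluates the size growth implied by the 0.3 slope and the corresponding change in mean 3D density within the effective radius. The function names are hypothetical and the scaling ρ ∝ M/re^3 is the only physics assumed.

```python
SLOPE = 0.3  # d(log r_e) / d(log M_stars) along the mean growth track


def size_growth_factor(mass_growth_factor, slope=SLOPE):
    """Factor by which r_e grows when M_stars grows by mass_growth_factor."""
    return mass_growth_factor ** slope


def density_ratio(mass_growth_factor, slope=SLOPE):
    """Ratio of final to initial mean density within r_e (rho ~ M / r_e^3)."""
    return mass_growth_factor / size_growth_factor(mass_growth_factor, slope) ** 3


if __name__ == "__main__":
    print(f"size growth for 10x mass growth: {size_growth_factor(10.0):.2f}")
    print(f"density change for slope 0.3:    {density_ratio(10.0):.2f}")
    print(f"density change for slope 1/3:    {density_ratio(10.0, 1.0 / 3.0):.2f}")
```

A slope of exactly 1/3 keeps the density fixed; with the adopted slope of 0.3 the density within re changes only mildly over a factor of 10 in mass, which is why the track is described as approximately constant density.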



Figure 22. Tracks of galaxies in the size-mass plane in different
studies. The solid blue and red curves show the evolution from z ~ 3
to z = 1.5 of number density-matched samples of low mass (van
Dokkum et al. 2013) and high mass (Patel et al. 2013) galaxies.
Broken curves show the evolution at z < 1.5. Magenta tracks are
the wind models of Hirschmann et al. (2013), for two different mass
ranges and 1.5 < z < 2.5. The orange curve is the evolution of the full
sample of massive Illustris galaxies from z = 3 to z = 1.5 in Wellons
et al. (2015). Thin black curves are individual simulated galaxies
in Zolotov et al. (2015), from z = 3 to z = 1.5. The mean Zolotov
evolution is indicated by the thick black dashes. The green arrow
is a good match to the mean growth of galaxies in all these studies:
Δ log re ~ 0.3 Δ log Mstars.


Although the 3D density within the effective radius stays
constant, a direct consequence of Eq. 30 is that the stellar
density within a physical radius, the stellar surface density, and
the stellar velocity dispersion all gradually increase as
galaxies form stars. We assume that galaxies have an increasing
likelihood of quenching as their velocity dispersion reaches
a particular threshold. This is motivated by numerous
studies showing that the specific star formation rates of galaxies
correlate much better with compactness than with mass (e.g.,
Kauffmann et al. 2003; Franx et al. 2008). We parameterize
this process by a dispersion-dependent quenching probability
Pq:

$$
P_q = \begin{cases}
0 & (x < 10.6) \\
\dfrac{x - 10.6}{0.3} & (10.6 < x < 10.9) \\
1 & (x > 10.9),
\end{cases} \qquad (31)
$$

with x = log Mstars − log re (see Fig. 23). Galaxies begin to
quench at log Mstars − log re > 10.6, or σq = 225 km s^-1 (Eq.
B-). As we show below, this particular choice of σq
provides a reasonably good fit to the data over the redshift range
1.5 < z < 3.0. We use a single value in this paper, but we note
that the threshold is a function of redshift: low redshift
galaxies quench at a lower density or dispersion than high redshift
galaxies (Franx et al. 2008).
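
The quenching prescription can be written as a short Python function. This is a minimal sketch, assuming that the probability rises linearly from 0 to 1 between the two thresholds (so that the ramp has a width of 0.3 dex in x); the function and argument names are hypothetical, with Mstars in solar masses and re in kpc.

```python
def quench_probability(log_mstars, log_re, x_lo=10.6, x_hi=10.9):
    """Quenching probability as a function of x = log M_stars - log r_e.

    Zero below x_lo, one above x_hi, and (assumed) linear in between.
    """
    x = log_mstars - log_re
    if x <= x_lo:
        return 0.0
    if x >= x_hi:
        return 1.0
    return (x - x_lo) / (x_hi - x_lo)


if __name__ == "__main__":
    # A 10^11 Msun galaxy with r_e = 2 kpc versus r_e = 1 kpc.
    import math
    for re_kpc in (2.0, 1.0):
        p = quench_probability(11.0, math.log10(re_kpc))
        print(f"r_e = {re_kpc} kpc: P_q = {p:.2f}")
```

Keeping the two thresholds as keyword arguments makes it easy to experiment with a redshift-dependent threshold, which the text notes is held fixed here.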

The average mass growth of the population is assumed to 
be a simple function of the star formation rate, modified by 
Figure 23. Parameterization of quenching. No galaxies with low
velocity dispersions are quenched, and all galaxies with high velocity
dispersions are quenched. The quenching probability begins to
increase at log Mstars − log re = 10.6. This threshold is held fixed in this
paper, but is in fact redshift dependent.

the quenching function:

$$ \Delta \log M_{\rm stars} = \beta\, \Delta t \times {\rm SFR} \times (1 - P_q). \qquad (32) $$

The parameter β encompasses mass loss due to stellar winds,
possible effects of mergers, and the well-documented offset
between the evolution of the star forming sequence and the
evolution of the stellar mass function (see Leja et al. 2015;
Papovich et al. 2015, and references therein). We adopt
β = 0.45; values of 0.4 < β < 0.5 produce very similar
results. A pure mass loss model would have β = w ~ 0.6 for a
Chabrier (2003) IMF. The star formation rate is given by the
star forming “main sequence”. We adopt the mass-dependent
parameterization of Whitaker et al. (2014):

$$ \log({\rm SFR}) = a + b \log M_{\rm stars} + c\, (\log M_{\rm stars})^2, \qquad (33) $$

with a = -19.99, b = 3.44, and c = -0.13 for the redshift range
of interest. As shown in Fig. 4, the actual star formation rates
of sCMGs are broadly consistent with this relation.
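
For reference, the quadratic main-sequence relation with the quoted coefficients can be evaluated as follows. This is an illustrative sketch; the function name is hypothetical, and the units (Mstars in solar masses, SFR in solar masses per year) are an assumption based on the usual convention for this relation.

```python
def main_sequence_sfr(log_mstars, a=-19.99, b=3.44, c=-0.13):
    """Star formation rate on the star forming sequence.

    log(SFR) = a + b * log(M_stars) + c * log(M_stars)**2, with the
    coefficients quoted in the text for the redshift range of interest.
    """
    log_sfr = a + b * log_mstars + c * log_mstars ** 2
    return 10.0 ** log_sfr


if __name__ == "__main__":
    for log_m in (10.0, 10.5, 11.0):
        sfr = main_sequence_sfr(log_m)
        ssfr = sfr / 10.0 ** log_m
        print(f"log M = {log_m:.1f}: SFR = {sfr:6.1f}, SSFR = {ssfr:.2e} per yr")
```

The negative quadratic term flattens the relation at high mass, so the specific star formation rate declines with increasing mass even though the star formation rate itself keeps rising over this range.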

The model is illustrated in Fig. 24, which shows galaxies
in the size-mass plane at 1.5 < z < 2.25. The color
indicates the fraction of galaxies that are quiescent according to
the UVJ criteria. Galaxies move along the green curves
until they cross the yellow line, when their quenching
probability rises steeply. In this model galaxies follow parallel tracks
in the size-mass plane, which means that large galaxies and
small galaxies at fixed mass have different formation
histories. However, we emphasize that individual galaxies likely
have complex histories, involving excursions above and
below these mean tracks (see, e.g., Zolotov et al. 2015). Our
description is qualitatively similar to the work of Williams et al.
(2014, 2015), who identified low mass Lyman break galaxies
with small sizes as possible progenitors of quiescent compact
massive galaxies.

8.2. Testing the Model 

We test the model in the following way. We first
quantify the distribution of galaxies in the size-mass plane at
Figure 24. Illustration of the “parallel track” model of massive galaxy
evolution. The blue and red squares show the distribution of galaxies
in the size-mass plane at 1.5 < z < 2.25, with the size of the square
proportional to the number of galaxies and the color indicating the
fraction of quiescent galaxies. Galaxies move along parallel tracks in
the size-mass plane, with Δ log re ~ 0.3 Δ log Mstars, until they cross
the yellow quenching line of constant σq ~ 225 km s^-1.


2.25 < z < 3.0, by measuring the number of galaxies in bins
of 0.1 dex × 0.1 dex (see Fig. 25a). Next, we evolve this
distribution forward in time, using timesteps of Δt = 100 Myr.
For each combination of (Mstars, re) we can calculate the SFR
from Eq. 33, Pq from Eq. 31, the change in mass from Eq. 32,
and the corresponding change in size from Eq. 30.
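
The update applied to each size-mass bin can be sketched in Python as below. This is a simplified illustration and not the authors' implementation: it follows a single (Mstars, re) point rather than a binned distribution, all function names are hypothetical, the quenching ramp is assumed to be linear, and the right-hand side of the mass-growth equation is interpreted as the stellar mass (in solar masses) added during one timestep, with the SFR in solar masses per year.

```python
import math

BETA = 0.45    # mass-growth parameter adopted in the text
SLOPE = 0.3    # d(log r_e) / d(log M_stars) along the growth track
DT_YR = 1.0e8  # timestep of 100 Myr


def sfr_main_sequence(log_m):
    """Star forming sequence with the coefficients quoted in the text."""
    return 10.0 ** (-19.99 + 3.44 * log_m - 0.13 * log_m ** 2)


def quench_probability(log_m, log_re):
    """Assumed linear ramp from 0 to 1 between x = 10.6 and x = 10.9."""
    x = log_m - log_re
    return min(1.0, max(0.0, (x - 10.6) / 0.3))


def step(log_m, log_re, dt_yr=DT_YR):
    """Advance one point in the size-mass plane by a single timestep."""
    mass = 10.0 ** log_m
    # Assumption: mass added = beta * dt * SFR * (1 - P_q).
    d_mass = BETA * dt_yr * sfr_main_sequence(log_m) * (
        1.0 - quench_probability(log_m, log_re))
    new_log_m = math.log10(mass + d_mass)
    # The size follows the mass growth along the 0.3 slope.
    new_log_re = log_re + SLOPE * (new_log_m - log_m)
    return new_log_m, new_log_re


def evolve(log_m, log_re, n_steps=10):
    """Apply n_steps timesteps (10 steps of 100 Myr correspond to 1 Gyr)."""
    for _ in range(n_steps):
        log_m, log_re = step(log_m, log_re)
    return log_m, log_re


if __name__ == "__main__":
    start = (math.log10(3.0e10), math.log10(0.7))
    end = evolve(*start)
    print(f"start: M = {10 ** start[0]:.2e} Msun, r_e = {10 ** start[1]:.2f} kpc")
    print(f"end:   M = {10 ** end[0]:.2e} Msun, r_e = {10 ** end[1]:.2f} kpc")
```

Applying `evolve` to the centre of every occupied bin and re-binning the results, weighted by the number of galaxies in each starting bin, would give a crude version of the evolved distribution; the treatment of quenched fractions within a bin is not modeled in this sketch.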

The evolved distribution after 10 timesteps (i.e., 1 Gyr)
is shown in Fig. 25b, with a small (4 %) correction to
account for the volume difference between 2.25 < z < 3.00
and 1.50 < z < 2.25. As expected, the galaxies have shifted
to larger masses and to slightly larger radii in the size-mass
plane. The distribution artificially falls off at low masses due
to the Mstars = 10^10 M⊙ limit in Fig. 25a. This limit was
chosen to ensure that the galaxies with the lowest masses and
highest redshifts have robust size measurements: the median
brightness of the 28 galaxies with 10.0 < log Mstars < 10.1 and
2.9 < z < 3.0 is ⟨H160⟩ = 23.9, well within the regime where
size measurements are reliable (see van der Wel et al. 2014b).

The observed distribution of galaxies at 1.50 < z < 2.25
is shown in Fig. 25c. In panel (d) this observed
distribution is multiplied by a weight mask, to account for the
artificial fall-off at low masses in panel (b). The weight mask
was constructed by evolving a galaxy population with a
uniform density distribution in the size-mass plane and a cutoff
at Mstars < 10^10 M⊙ forward in time (in the same way as
described above). The distribution in Fig. 25d is remarkably
similar to that in Fig. 25b. Furthermore, the total number
density of galaxies in the two panels is almost identical; panel (d)
has 7 % fewer galaxies than panel (b).

In Fig. 26 the color-coding reflects the specific star
formation rates of the galaxies, with redder squares indicating a
lower SSFR. The figure looks very similar when the fraction
of quiescent galaxies is used for the color coding instead of the
Figure 25. Testing the “parallel track” model for the creation of compact massive galaxies. Panel (a) shows the observed number density of
galaxies in the size-mass plane at 2.25 < z < 3.00, with the grey scale proportional to the number of galaxies. In panel (b) the distribution is
evolved forward in time by 1.0 Gyr to 1.50 < z < 2.25, by assuming that galaxies grow along lines of Δ log re = 0.3 Δ log Mstars and quench
after they pass the yellow line. Panel (c) shows the observed number density of galaxies at 1.50 < z < 2.25. Panel (d) is identical to panel
(c), but weighted to account for the edge effect at low masses in the model prediction of panel (b). The distribution of galaxies in panel (d) is
remarkably similar to that in panel (b), demonstrating that compact massive galaxies at z ~ 2 can be formed by simple mass growth of galaxies
at higher redshift.


SSFR. The sizes of the squares are proportional to the
number of galaxies. The model naturally produces a population
of quiescent galaxies with Mstars ~ 10^11 M⊙ and re ~ 1 kpc.
In our model, the progenitors of these galaxies have masses
of ~ 3 × 10^10 M⊙ and sizes of ~ 0.7 kpc at z ~ 3. The model
does not produce the right fraction of quiescent galaxies at the
highest masses and largest sizes; many of these galaxies are
forming stars at z ~ 1.9 even though they have high
galaxy-averaged velocity dispersions. This suggests that our
quenching prescription is too simplistic in this regime (see Sect. 8.3).

We compare the predicted to the observed number
densities explicitly in Fig. 27. This Figure highlights the
excellent match of our model to the size distribution of all
galaxies over the entire mass range 10.5 < log Mstars < 11.5: it not
only reproduces the peak in the distribution at ~ 2.5 kpc


but also the “shoulder” of compact quiescent galaxies. It also
demonstrates that the modeling of quenching is too simplistic
for large galaxies, as was already clear from the comparison
of panels (b) and (d) of Fig. 26. In particular, nearly 100 %
of galaxies with re > 2 kpc are forming stars in the model,
whereas the observed star forming fraction is only ~ 85 %.

8.3. Summary of the Modeling 

In summary, we have shown that the population of
compact, massive galaxies at z ~ 2 can be explained by a model in
which galaxies form stars at a rate that is dictated by the star
forming sequence, experience a modest increase in size for
a given increase in mass, and quench after passing a
velocity dispersion threshold. This was demonstrated by evolving
the observed galaxy population at z ~ 2.6 forward by 1 Gyr
Figure 26. Same as Fig. 25, but with color coding indicating the median specific star formation rate of the galaxies. Our simple model naturally
produces a population of massive, compact quiescent galaxies at Mstars ~ 10^11 M⊙ and re ~ 1 kpc. The model overpredicts the quiescent
fractions at the largest masses and sizes.


to z ~ 1.9. This is a critical period as the number density of
qCMGs increases by an order of magnitude over that redshift
range.

Although it is beyond the scope of this (already somewhat
unwieldy) paper, we note that the modeling can easily be
extended. In particular, it would be straightforward to fit for the
two tunable parameters (the quenching dispersion σq and the
parameter β, which relates the mass growth to the star
formation rate). Furthermore, our quenching description is
inadequate in the high mass / large size regime; the yellow line
in Fig. 24 is somewhat too steep. A possible explanation is
that quenching depends on the galaxy properties in the central
~ 1 kpc, and the simple Mstars/re criterion no longer “works”
in a regime where re ≫ 1 kpc. Some evidence for this comes
from a study of the mass in the central < 1 kpc of
galaxies (van Dokkum et al. 2014): as we showed in Fig. 9 of
that paper the mass inside of 1 kpc is an excellent predictor of
quiescence at all redshifts. Finally, the modeling can be
extended to lower redshifts, taking evolution in σq into account


(see Sect. 123). 

9. DISCUSSION 

9.1. The Formation of Today’s Massive Galaxies 

In the preceding sections we discussed a simple model for
the evolution of massive galaxies at 2 < z < 3: they grow
inside-out with Δ log re ~ 0.3 Δ log Mstars (Eq. 30) while they
are forming stars, and quench when they reach a density or
velocity dispersion threshold. This model provides an
explanation for the fact that large galaxies have younger stellar
populations than small galaxies at fixed mass (e.g., Franx et al.
2008), as only the smallest galaxies have reached the
quenching threshold. Galaxies enter the massive, compact selection
region in the size-mass plane “from the left”, that is, by
increasing their masses. This seems different from models in
which large, massive galaxies enter this region “from above”,
that is, by decreasing their sizes through mergers (e.g.,
Hopkins et al. 2009b) or by gas “compaction” followed by star
Figure 27. The number density of galaxies as a function of size at
1.50 < z < 2.25, in two mass bins. Points with errorbars are the
observed values; black points show all galaxies and red points show
quiescent galaxies only. The lines are the predicted distributions
in our model, that is, the observed distribution at 2.25 < z < 3.00
evolved forward in time by 1.0 Gyr. The size distributions are well
reproduced in this model, in both mass bins (black lines). The match
to the subset of quiescent galaxies is very good at the smallest sizes
but shows systematic differences at intermediate and large sizes.

formation (Dekel & Burkert 2014). This apparent difference
may reflect a difference in approach: in this paper we are
concerned with the average evolution of the population of
massive galaxies, whereas simulations such as those of Zolotov
et al. (2015) are able to follow the tracks of individual
galaxies in the size-mass plane. Judging from the Zolotov et al.
(2015) tracks, Eq. 30 may simply be the time- and
population-average of periods of proportional size and mass growth
(Δ log re ~ Δ log Mstars), periods of compaction, and the
effects of mergers.

At lower redshifts massive galaxies evolve along a
markedly different track in the size-mass plane: van Dokkum
et al. (2010), Patel et al. (2013), and others find that the size
and mass evolution of massive galaxies are related through
Δ log re ~ 2 Δ log Mstars at 0 < z < 2 (as indicated by the
dotted section of the red curve in Fig. 22). This evolution can be
explained by minor, gas-poor mergers building up the outer
envelopes of galaxies (Bezanson et al. 2009; Naab et al. 2009;
Hopkins et al. 2010; Hilz et al. 2013). In van Dokkum
et al. (2010) we showed that any physical process that
deposits mass at r > re leads to a steep track in the size-mass
plane, due to the definition of the effective radius.

Note that the term “compaction” refers to the gas, not the stars; in the 
Zolotov et al. models the (indirect) effect on the stellar effective radius is 
generally much smaller than that on the gas radius. 


A schematic of the growth of massive galaxies from z ~ 3
to z ~ 0 is shown in Fig. 28. After galaxies quench, their
mass growth per unit time is reduced, but their effective radii
continue to increase. This Figure suggests that there are
multiple paths leading to large, massive, quiescent galaxies in the
local Universe, as was also noted in, e.g., Cappellari et al.
(2013) and Barro et al. (2014a). Their z ~ 2 progenitors can
be large star forming (disk) galaxies, such as those studied
extensively by, e.g., Genzel et al. (2008) and Förster Schreiber
et al. (2011), or compact, massive, quiescent galaxies that
have grown through mergers (e.g., Trujillo et al. 2011; Patel
et al. 2013; Ownsworth et al. 2014). As shown in Fig. 2 of
van Dokkum et al. (2014) massive z = 0 galaxies have a large
range of central densities at fixed total mass, as expected in
such scenarios. It is possible that massive S0 galaxies formed
from large star forming galaxies and massive elliptical
galaxies formed from compact star forming galaxies, although it
remains to be seen whether the stellar populations of massive
early-type galaxies are sufficiently diverse to accommodate a
large range in formation histories (Gallazzi et al. 2005; van
Dokkum & van der Marel 2007).



Figure 28. Illustration of possible average tracks of galaxies in the
size-mass plane from z ~ 3 to z ~ 0. While they are forming stars,
galaxies grow mostly in mass and gradually increase their density.
After reaching a velocity dispersion or stellar density threshold (the
yellow line, whose location is redshift dependent) they quench, due
to AGN feedback or other processes that correlate with stellar
density. The dominant mode of growth after quenching is dry merging,
which takes galaxies on a steep track in the size-mass plane.


9.2. Winds, Shocks, and AGN 

In this paper we mostly ignored the effects of AGN, despite
the fact that nearly half of the 25 galaxies with Keck spectra
have X-ray luminosities above the canonical AGN limit of
$L_X > 10^{43}$ ergs s$^{-1}$ (the number of galaxies with active
nuclei could be even higher, as the X-ray selection is biased
against Compton-thick AGN; see, e.g., Fiore et al. 2008). The
reason is that these effects are difficult to constrain and
quantify. Barro et al. (2013) discuss the
high occurrence rate of AGN in compact star forming galaxies
extensively, and argue that they are the agent of quenching.
This may be true: in many galaxy formation models AGNs
play a crucial role in quenching star formation precisely in
this mass and redshift range (e.g., Croton et al. 2006; Hopkins
et al. 2008). However, the star formation rates of the sCMGs
are (still) high and consistent with the z ~ 2.3 star forming
sequence (Whitaker et al. 2014), and there is no evidence for
a direct effect of the AGNs on star formation. Turning this
around, it is obviously the case that the black holes are growing
in these galaxies, and that they are growing at a time when
the dense stellar centers are also growing. This is not surprising,
as it is difficult to see how to avoid a high accretion rate
onto the central object in these extremely dense, highly star
forming galaxies.

An obvious point of concern is that the presence of AGNs
causes errors in the derived physical parameters of the galaxies.
In principle, an AGN in a relatively low mass, relatively
large, and relatively quiescent galaxy could push the galaxy
into the sCMG category: the extra light of the AGN could be
mistaken for star light, increasing the mass; the combination
of a point source with a normal galaxy could be mistaken for a
compact bulge-dominated object; and the hot IR flux from the
AGN could be mistaken for PAH features from star formation.
This can only be addressed properly with data of much higher
spatial resolution than is available today, but we note here that
the galaxies with AGNs do not stand out in any of the figures.
The only exception is that the four galaxies with the highest
measured velocity dispersions all have X-ray AGN, and also
[N II]/Hα ratios of ~ 1. We have treated these four galaxies in
the same way as the others.

A related issue is the almost-certain presence of galactic-scale
winds and outflows. Such winds can be driven by star
formation (e.g., Heckman, Armus, & Miley 1987) and/or
AGNs (e.g., Proga, Stone, & Kallman 2000) and are ubiquitous
in star forming galaxies at high redshifts (Franx et al.
1997; Pettini et al. 1998; Förster Schreiber et al. 2014; Genzel
et al. 2014a). Galactic superwinds can create bubbles and
shock fronts whose kinematics, spatial extent, and emission
line ratios are very similar to what we observe. In at least one
of the galaxies in our sample, COSMOS_1014, there is evidence
for a broad Hα line in addition to a narrow component,
similar to IRAS 11095-0238 (Soto & Martin 2012) and galaxies
in Förster Schreiber et al. (2014). Furthermore, four of the
galaxies in our sample are part of the sample of massive galaxies
of Genzel et al. (2014a) (COSMOS_11363, GOODS-S_30274,
GOODS-S_37745, and GOODS-S_45068), and
they find broad nuclear velocity components in two of them
(COSMOS_11363 and GOODS-S_30274). A detailed study
of the kinematics and line ratios of GOODS-S_30274 was
also done by van Dokkum et al. (2005).

Although winds are almost certainly present, two results
suggest that they are not dominating the galaxy-integrated
emission line widths. First, winds tend to escape in a direction
perpendicular to the plane of the galaxy (Heckman,
Armus, & Miley 1990), which is difficult to reconcile with
the observed anti-correlation between velocity dispersion and
axis ratio (Fig. 11). Second, the observed kinematics are fully
explained by the stellar mass, leaving little room for additional
broadening due to winds. In fact, broad components
in the velocity profiles are expected just from rotating gas at
small radii: as shown in Fig. 18, gas at ~ 1 kpc should have
FWHM ≈ 1000 km s$^{-1}$ even in the absence of winds. Judging
from other z ~ 2 galaxies the disks are also likely to be highly
turbulent, with a relatively high internal dispersion (see, e.g.,
Cresci et al. 2009; Förster Schreiber et al. 2009). The gaseous
environments of sCMGs may be similar to those of ULIRGs,
which are highly complex: as shown in Soto & Martin (2012)
they can have rotating, large-scale disks in addition to outflows
and shocks.

Finally, we note that the presence of spatially-extended gas
disks in these galaxies had been predicted by Zolotov et al.
(2015). They also predicted that the gas dispersions are, on
average, lower than the stellar dispersions (Fig. 10a), as the
gas is in disks which are sometimes seen face-on. Interestingly,
Zolotov et al. (2015) also find that the gas constitutes
only a small fraction of the total baryonic mass of the simulated
compact massive star forming galaxies, although they
note that this result is sensitive to the feedback prescription.
Similarly, Johansson et al. (2012) predicted that compact,
massive galaxies are stellar mass-dominated and have Keplerian
rotation curves; the model rotation curves in their Fig.
7 are remarkably similar to the inferred rotation curve shown
in our Fig. 18.

9.3. Submm-Galaxies, Far-IR Selected Galaxies, and Quasars

This study begins with an HST/WFC3-selected sample in a
total area of ~ 0.25 square degrees. Many other studies have
found extreme star forming galaxies by selecting them on the
basis of their far-infrared, submm, or radio emission instead
(e.g., Kormendy & Sanders 1992; Sanders & Mirabel 1996;
Barger et al. 1998; Smail et al. 2000; Barger et al. 2001;
Casey et al. 2012). These extreme galaxies are plausible ancestors
of early-type galaxies; as an example, Tacconi et al.
(2008), Toft et al. (2014), and Simpson et al. (2015) have
suggested that many submm galaxies could be direct progenitors
of compact quiescent galaxies at z ~ 2.

We do not select against such objects, and our sample
should include the proper number of submm galaxies, radio
galaxies, and other extreme objects. However, there are
(at least) two possible reasons why galaxies selected at other
wavelengths could be underrepresented in our sample: some
fraction may be too faint in the near-IR to be included (or to
be properly characterized) in the Skelton et al. (2014) catalogs,
and some may be too rare to be represented in the
3D-HST/CANDELS area. sCMGs have such high column densities
in the central regions that some may be entirely obscured
at rest-frame optical wavelengths (Gilli et al. 2014; Nelson
et al. 2014). Wang, Barger, & Cowie (2012) and Caputi et al.
(2014) show that objects exist that are relatively bright in the
IRAC bands but that are undetected in deep near-IR data. It
is obviously difficult to measure the redshifts and masses of
these objects with traditional means, but it may be possible
using molecular lines (see Walter et al. 2012; Riechers et al.
2013). In the context of the study presented here the question
is not whether any massive, compact, “optically-dark” galaxies
were missed, but what fraction of mass and star formation
is in such objects.

The second class of potentially missed objects are extremely
rare, extremely luminous galaxies. The median
star formation rate of sCMGs in our study is ⟨SFR⟩ =
134 $M_\odot$ yr$^{-1}$, and we have 112 such objects at 2 < z < 2.5.
Therefore, objects that are so rare that there are only a few
(or zero) in our survey volume must have star formation rates
> 5000 $M_\odot$ yr$^{-1}$ to have a significant impact on our results.
This seems extreme, but such objects probably exist: the most
extreme Herschel-selected galaxies at 2 < z < 5 have estimated
star formation rates up to ~ 9000 $M_\odot$ yr$^{-1}$ (Casey et
al. 2012). Furthermore, recently identified highly obscured
quasars have bolometric luminosities of $L_{\rm bol} \sim 10^{47}$ ergs s$^{-1}$
(Banerji et al. 2012, 2015), and it seems likely that the growth
of the black holes in these objects is accompanied by prodigious
star formation. It remains to be seen whether such objects
are sufficiently common (or rather, long-lived) to impact
results derived from CANDELS-sized areas.
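
The threshold quoted in this paragraph can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that the summed star formation rate of the sample is approximately the number of objects times the median rate; this approximation and the function name `rare_object_fraction` are not from the paper.

```python
# Rough check of how much a single rare, extreme object would add to the
# summed star formation rate of the sCMG sample.
# Assumption (not from the paper): total SFR ~ N x median SFR.

N_SCMG = 112          # number of sCMGs at 2 < z < 2.5 (from the text)
MEDIAN_SFR = 134.0    # median SFR in Msun/yr (from the text)


def rare_object_fraction(sfr_rare, n_rare=1):
    """Ratio of the SFR in rare objects to the approximate sample total."""
    total_sample = N_SCMG * MEDIAN_SFR
    return n_rare * sfr_rare / total_sample


if __name__ == "__main__":
    for sfr in (1000.0, 5000.0, 9000.0):
        frac = rare_object_fraction(sfr)
        print(f"SFR = {sfr:6.0f} Msun/yr -> {frac:.2f} of the sample total")
```

Under this approximation, a single object forming stars at 5000 $M_\odot$ yr$^{-1}$ would add about a third of the summed rate of the 112 sCMGs, which is why only such extreme objects could matter if they are this rare.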

Finally, we note that we do not find a correlation between
size and IR luminosity at fixed stellar mass, that is, an IR
selection does not preferentially select compact galaxies but
objects with a wide range of rest-frame optical sizes (see also
Wiklind et al. 2014; Simpson et al. 2015). As an IR selection
is effectively a star formation selection at high masses (see,
e.g., Whitaker et al. 2012; Rodighiero et al. 2014), this is
perhaps not surprising.

10. SUMMARY AND CONCLUSIONS 

In this paper we have identified a population of star forming,
compact, massive galaxies in the five fields of the CANDELS
and 3D-HST surveys. Such objects have been studied
previously by Barro et al. (2013, 2014b, 2014a) and Nelson
et al. (2014), and we build on their results. Compared to the
Barro et al. studies, our selection is more restrictive, focusing
only on the most massive and most compact galaxies; we
study an area that is ~ 2.5 times larger; and our redshift catalogs
make use of the 3D-HST grism spectra for all objects
brighter than $H_{160} < 24$.

We first confirm the redshifts and masses of the galaxies using
Keck MOSFIRE and NIRSPEC spectroscopy of 25 compact
massive star forming galaxies at 2 < z < 2.5. The gas
dynamics suggest that the galaxies are embedded in spatially-extended
rotating disks; this explains the low measured dispersions
of a large fraction of the sample and the observed
anti-correlation between the dispersion and the axis ratio of
the galaxies. Support for this interpretation comes from direct
measurements of the sizes of the Hα disks for 10 galaxies;
the fact that this is possible at all from ground-based, seeing-limited
data already shows that the gas extends to scales
≳ 1 kpc. The derived sizes of the gas disks, and the fall-off
of the rotation curve that we construct for the galaxies, are in
very good agreement with recent models for the formation of
massive galaxies (Johansson et al. 2012; Zolotov et al. 2015).

It is important to note that, in our interpretation, the measured
gas velocity dispersions of the galaxies generally do
not reflect the true $V_{\rm rot}$ in the stellar body. We predict that
the (inclination-corrected) velocities at r < 1 kpc are 400 -
500 km s$^{-1}$ for all galaxies. This can be tested with adaptive
optics-assisted observations of the Hα line. There is evidence
for broad components in several of the velocity profiles (see
Sect. 9.2), and these complex profiles may reflect the combined
effect of high rotation velocities at small radii and lower
velocities at larger radii. A more direct measurement could
come from CO line widths, as these likely probe much smaller
radii than the Hα emission (see, e.g., Downes & Solomon
1998).

Next, we interpret the existence of star forming, compact
galaxies at 2 < z < 2.5 in the context of a simple model for the
evolution of galaxies in the size-mass plane. We describe the
average evolution of star-forming galaxies by the simple relation
$\Delta \log r_e \sim 0.3\,\Delta \log M_{\rm stars}$, with the mass evolution
proportional to the main sequence star formation rate. We show
that this evolution is a consistent feature in galaxy formation
models of Hirschmann et al. (2013), Wellons et al. (2015),
and Zolotov et al. (2015), and is also seen in observations of
number density-matched samples of galaxies (van Dokkum
et al. 2013; Patel et al. 2013; Ownsworth et al. 2014).

As galaxies move along this track their average 3D density
within $r_e$ remains approximately constant (as
$\rho(r_e) \propto M/r_e^3$, it follows that $r_e \propto M^{1/3}$
if the density is constant). However,
their density within a fixed physical radius increases, as does
their projected (2D) density and their velocity dispersion. Following
many other studies (e.g., Franx et al. 2008; Bell et al.
2012), we assume that quenching occurs when galaxies reach
a threshold in either velocity dispersion or physical density.
We show that this model explains the evolution of the distribution
of galaxies in the size-mass plane from z ~ 2.6 to z ~ 1.9,
the redshift range when the number density of massive compact
quiescent galaxies increases by nearly an order of magnitude.
In the context of this straightforward model, the progenitors
of compact massive star forming galaxies at z ~ 2.5 were
simply somewhat less massive and slightly smaller galaxies at
z ~ 3.
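
As a worked illustration of the growth track, the following sketch evaluates how the effective radius and the mean density within $r_e$ change when a galaxy doubles its stellar mass along $\Delta \log r_e = 0.3\,\Delta \log M_{\rm stars}$. The function names and the starting values (a $5 \times 10^{10}\,M_\odot$ galaxy with $r_e = 1$ kpc) are hypothetical, and the density estimate assumes that half of the mass lies within a sphere of radius $r_e$, which is only a rough approximation.

```python
import math


def size_on_track(r0_kpc, m0_msun, m1_msun, slope=0.3):
    """Effective radius after growing from m0 to m1 along
    d(log r_e) / d(log M) = slope."""
    return r0_kpc * (m1_msun / m0_msun) ** slope


def mean_density_within_re(mass_msun, r_e_kpc):
    """Mean 3D density in Msun / kpc^3, assuming (as a simplification)
    that half of the mass sits inside a sphere of radius r_e."""
    volume = 4.0 / 3.0 * math.pi * r_e_kpc ** 3
    return 0.5 * mass_msun / volume


if __name__ == "__main__":
    m0, r0 = 5.0e10, 1.0   # illustrative starting point, not from the paper
    m1 = 2.0 * m0          # the galaxy doubles its stellar mass
    r1 = size_on_track(r0, m0, m1)
    rho0 = mean_density_within_re(m0, r0)
    rho1 = mean_density_within_re(m1, r1)
    print(f"size grows by a factor {r1 / r0:.3f}")
    print(f"mean density within r_e changes by a factor {rho1 / rho0:.3f}")
```

With a slope of 0.3 the density within $r_e$ scales as $M^{1 - 0.9} = M^{0.1}$, so it changes by only about 7% per doubling in mass, close to the constant-density case $r_e \propto M^{1/3}$.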

Our study has several important systematic uncertainties.
First, the stellar masses of the galaxies are derived from fitting
stellar population synthesis models to the photometry,
and these models have not been tested for the extreme galaxies
that are under discussion in this paper. Such tests are urgently
needed but they are difficult, even for quiescent galaxies
and for “normal” star forming galaxies in the local Universe
(Muzzin et al. 2009b; Conroy 2013). One interpretation
of Fig. 10b is that the stellar masses are off by factors
up to ~ 10; however, as we show in the remainder of Sect. 5
the dynamical masses and stellar masses are consistent with
each other once orientation effects and the spatial extent of
the gas are taken into account. Our final dynamical result
($M_{\rm fit} = 0.8^{+0.5}_{-0.4} \times M_{\rm stars}$; Sect. 6.4) suggests that the
contributions of dark matter and gas to the mass within ~ 7 kpc are
small. We have assumed a relatively bottom-light Chabrier
(2003) IMF when deriving stellar masses; if we assume a
Salpeter (1955) IMF instead (see, e.g., van Dokkum & Conroy
2010; Conroy & van Dokkum 2012; Cappellari et al.
2012) we find $M_{\rm fit} \approx 0.5 \times M_{\rm stars}$, and even tighter
constraints on the amount of gas and dark matter. We emphasize,
however, that the conversion of light to stellar mass for these 
dusty, compact star forming galaxies is highly uncertain. We 
also note here that the stellar masses are not corrected for the 
contribution of emission lines to the SEDs. These corrections 
are generally small (~ 10%).

Second, the role of winds and active nuclei in these galaxies
is not well understood (Sect. 9.2). They almost certainly
influence the measured dynamics and line ratios, but without
spatially-resolved data it is very difficult to disentangle the
effects of winds, a falling rotation curve, and the spatial distribution
of the ionized gas. Third, the fact that the galaxies
are all very dusty may imply that we are missing part of the
population due to selection effects (Sect. 9.3). We could be
missing galaxies outright (see Fig. 3 in Nelson et al. 2014), or
they could be misclassified as less compact, lower mass galaxies
if only their outer edges are detected in the currently available
data. Another potential effect of the dust is that the stellar
population modeling may produce incorrect stellar masses:
the modeling uses a screen approximation for dust, whereas
in reality the dust and stars are almost certainly mixed.

Fortunately, the prospects for addressing these uncertainties
are excellent. Adaptive optics-assisted spectroscopic observations
with integral field units on 8 m - 10 m telescopes can be
used to measure kinematics and line ratios on < 1 kpc scales
(e.g., Newman et al. 2013). The morphology of the dust and
molecular gas emission can be studied with interferometers
such as the Very Large Array, the Plateau de Bure Interferometer,
and the Atacama Large Millimeter Array (see, e.g.,
Simpson et al. 2015, for impressive early ALMA results on
submm-selected galaxies). These instruments can also measure
the kinematics of the molecular gas (e.g., Tacconi et al.
2008). On a longer timescale, the James Webb Space Telescope
can measure the stellar kinematics of the galaxies, as
well as identify and characterize compact galaxies that are
entirely obscured in the K band (Wang et al. 2012). Finally,
the upcoming generation of extremely large ground-based
optical/near-IR telescopes is needed to spatially resolve
these compact, massive galaxies within their effective radius.

APPENDIX
A. $H_{160}$ IMAGES

In the main text we show color images of the 25 star forming compact massive galaxies, created from the $J_{125}$ and $H_{160}$
CANDELS data (Fig. 6). In Fig. A1 we show the $H_{160}$ images separately, with a higher dynamic range than in Fig. 6. The tidal
features around GOODS-S_30274 and COSMOS_11363 are very clear, and several other galaxies also show structure at faint
surface brightness. We fit all galaxies with a single Sersic profile, which is an excellent approximation of the average surface
brightness profile of the full sample (see Sect. 7.2); however, it is clear that these fits do not capture the full information in the
HST images.

Figure A1. HST images of the galaxies in Figs. 5, 6, 7, and 15, in the $H_{160}$ band. The galaxies are displayed with a high dynamic range, so that
faint structures around bright cores can be seen more clearly than in Fig. 6 of the main text. GOODS-S_30274 and COSMOS_11363 show
clear tidal features.

B. EXPECTED AND OBSERVED UNCERTAINTIES IN THE SPECTRA

As described in Sect. 3.4.1 we fit Gaussian models to the emission lines. The fits are done with the emcee code, with the
observed 1D spectrum and a noise model as inputs for each galaxy. Here we briefly analyze the residuals from these fits to
determine the accuracy of the noise models.

In Fig. B1 we show the spectra of the 20 galaxies that were observed by us. For convenience, the figure has the same format
as Figs. 5, 6, 7, and 15 in the main text, except that the five galaxies from Barro et al. (2014) are left blank. For each galaxy
three subpanels are shown. The top subpanel is identical to the main panel of Fig. 5, and shows the observed spectrum in black
along with the best-fitting model in red. The middle subpanel shows the noise model (empirical in the case of MOSFIRE and
theoretical in the case of NIRSPEC; see Sect. 3.1 and Sect. 3.2). The bottom subpanel is the residual from the fit divided by the
noise model.



Figure B1. Analysis of the noise in the NIRSPEC and MOSFIRE spectra. The galaxies have the same order as in Fig. 5; panels for objects taken
from Barro et al. (2014) are left blank. For each galaxy, the top panel shows the spectrum and the best-fitting model; the middle panel shows
the expected noise (see Sect. 3.1 and Sect. 3.2); and the bottom panel shows the difference between the observed spectrum and the best-fitting
model divided by the expected noise. The width of the distribution of these residuals is ~ 1 in nearly all cases.

The residuals are well-behaved, and generally exhibit no indications of poorly subtracted sky lines or other irregularities. We
quantified this by calculating the biweight scatter $\sigma_{\rm bi}$ (see Beers et al. 1990) in the distribution of residuals. The value of $\sigma_{\rm bi}$
deviates by more than ~ 30% from unity in only two cases, UDS_35673 and COSMOS_11363. Both galaxies have very high
S/N ratio spectra, and the higher than expected residuals are not caused by errors in the noise spectra but by the fact that the
velocity distributions are not exactly Gaussian. The average scatter of the remaining 18 galaxies is $\langle\sigma_{\rm bi}\rangle = 1.09$, which means
that the noise models that we use are accurate to ~ 10%.
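
For readers who want to repeat this check, the following is a minimal sketch of the biweight scale estimator of Beers et al. (1990), applied to a list of normalized residuals. It is not the code used for the paper: the function name `biweight_scale` and the mock Gaussian residuals are illustrative assumptions, and the tuning constant c = 9 is the value commonly adopted for this estimator.

```python
import math
import random
import statistics


def biweight_scale(values, c=9.0):
    """Biweight scale estimate of a sample, following the estimator
    described by Beers et al. (1990), with tuning constant c."""
    x = list(values)
    n = len(x)
    med = statistics.median(x)
    mad = statistics.median([abs(v - med) for v in x])
    if mad == 0.0:
        return 0.0  # degenerate sample: no measurable spread
    num = 0.0
    den = 0.0
    for v in x:
        u = (v - med) / (c * mad)
        if abs(u) < 1.0:  # points beyond c * MAD get zero weight
            num += (v - med) ** 2 * (1.0 - u * u) ** 4
            den += (1.0 - u * u) * (1.0 - 5.0 * u * u)
    return math.sqrt(n) * math.sqrt(num) / abs(den)


if __name__ == "__main__":
    # Mock residuals (data - model) / noise; for an accurate noise model
    # these are unit-variance Gaussian, so the scale should be close to 1.
    random.seed(42)
    residuals = [random.gauss(0.0, 1.0) for _ in range(5000)]
    print(f"biweight scale: {biweight_scale(residuals):.2f}")
```

Because strongly deviating points receive zero weight, a few badly subtracted sky-line pixels would not inflate the estimate, which is the reason to prefer it over a plain standard deviation here.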

C. CONVERTING GALAXY-AVERAGED VELOCITY DISPERSIONS TO A ROTATION CURVE 

Motivation 

In Sect. 6.4 we construct the average rotation curve for star forming compact massive galaxies. This is done by combining
information for 10 different galaxies: all galaxies have approximately the same stellar masses and $H_{160}$ half-light radii, but they
have a wide range of Hα effective radii. For each galaxy we measure the galaxy-integrated velocity dispersion and the inclination,
and convert these to an inclination-corrected rotation velocity at $r = r_{\rm gas}$, where $r_{\rm gas}$ is the half-light radius of the Hα emission.
The rotation velocities of the galaxies are then plotted versus $r_{\rm gas}$ in Fig. 18, and the resulting relation is interpreted as a rotation
curve.

Figure C1. a) Surface density profile of a model galaxy with a stellar mass of $10^{11}\,M_\odot$, Sersic index n = 4, and an effective radius $r_{\rm mass} = 1$ kpc
(grey). Black lines show four different Hα n = 1 surface brightness profiles, with effective radii ranging from 0.5 kpc to 4 kpc. b) Rotation curve
of the Hα-emitting gas disks in the model galaxy. The Hα emission is assumed to be a tracer of, not a contributor to, the mass, and the rotation
curve is identical in all four models. c-f) Observed galaxy-integrated Hα velocity profiles for the four surface brightness profiles shown in panel
a, assuming an inclination of 60° and an instrumental resolution of 60 km s$^{-1}$. The red curves are Gaussian fits to the observed profiles. The
measured dispersion is lower for higher values of $r_{{\rm H}\alpha}/r_{\rm mass}$, as the profile is weighted toward larger radii.


Here we test whether this method is viable, that is, whether the actual rotation curve of a model galaxy can be reconstructed
in this way. We also test whether we are using the correct conversion constant to go from a galaxy-integrated velocity dispersion
to a rotation velocity at the half-light radius of Hα. This constant, together with an inclination correction, relates the velocity
dispersion $\sigma$ to the rotation velocity $V_{\rm rot}$:

$$\alpha = \frac{\sigma}{V_{\rm rot}\,\sin(i)} \qquad (\mathrm{C1})$$

(see Eq. 17 and Eq. 19). In the main text we use $\alpha = 0.8 \pm 0.2$, based on previous studies (see Sect. 5.1). However, these
studies did not consider the specific model of a compact, $r^{1/4}$-law mass distribution combined with an extended, exponential gas
distribution.
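
As a small illustration of how Eq. C1 is applied, the sketch below inverts it to obtain an inclination-corrected rotation velocity from a galaxy-integrated dispersion. The function name `rotation_velocity` and the example input values are hypothetical; only the relation itself and the default α = 0.8 come from the text.

```python
import math


def rotation_velocity(sigma_kms, incl_deg, alpha=0.8):
    """Invert Eq. C1: V_rot = sigma / (alpha * sin(i)).

    sigma_kms : galaxy-integrated velocity dispersion in km/s
    incl_deg  : inclination in degrees (90 = edge-on)
    alpha     : conversion constant between dispersion and rotation velocity
    """
    sin_i = math.sin(math.radians(incl_deg))
    if sin_i <= 0.0:
        raise ValueError("inclination must be between 0 and 180 degrees (exclusive)")
    return sigma_kms / (alpha * sin_i)


if __name__ == "__main__":
    # Illustrative values only: a 200 km/s dispersion at i = 60 degrees.
    print(f"V_rot = {rotation_velocity(200.0, 60.0):.0f} km/s")
```

For these illustrative inputs the correction raises the velocity from 200 km s$^{-1}$ to roughly 290 km s$^{-1}$; a face-on disk (small i) would require a much larger correction, which is why the inclination has to be measured for each galaxy.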


Modeling Velocity Profiles 

We simulated the observations in the following way. We constructed a model mass distribution that follows a Sersic surface
density profile. This mass distribution is characterized by three parameters: the Sersic index n, the effective radius $r_{\rm mass}$ (this
parameter is equivalent to both $r_{\rm stars}$ and $r_e$ in the main text), and the total mass M. We fixed $r_{\rm mass} = 1$ kpc and $M = 1.0 \times 10^{11}\,M_\odot$,
and for the initial model we set n = 4. Apart from a slight rescaling of the effective radius, this model closely matches the actual
average stellar mass distribution of the sCMGs, if mass traces the $H_{160}$ light. The model is shown in Fig. C1a by the grey line.

Next, we constructed 10 model galaxies, each with the same mass distribution but with different distributions of the Hα
emission. The ionized gas is in thin exponential disks, with effective radii ranging from $r_{{\rm H}\alpha} = 0.5$ kpc (and hence $r_{{\rm H}\alpha} = 0.5\,r_{\rm mass}$)
to $r_{{\rm H}\alpha} = 5$ kpc. Four of these model gas distributions are shown by the black lines in Fig. C1a. The gas disks mimic the derived
extended ionized gas of sCMGs, with $r_{{\rm H}\alpha}$ equivalent to the parameter $r_{\rm gas}$ in the main text. Galaxy-integrated velocity profiles
were created by integrating the projected velocities along the line of sight and over the full spatial extent of the model galaxies.
The velocities were calculated from the mass profile shown in Fig. C1a and weighted by the Hα flux. In order to model the
observed profiles as closely as possible, we used an inclination of 60° (where 90° is edge-on) and an instrumental resolution of
60 km s$^{-1}$ (in between the MOSFIRE and NIRSPEC resolution).

The velocity profiles of the four model galaxies are shown in panels c-f of Fig. C1. As expected they have the classic “double-horned”
form that is characteristic of rotating disks. The profile is not the same for all four models even though the mass
distribution, and hence the underlying velocity field, is identical in all cases. The more extended the Hα distribution is with
respect to the mass, the narrower the profile becomes, and the more closely it resembles a Gaussian. The reason for this behavior
is that the Hα emission is more weighted toward larger radii, where the rotation velocity is lower. Velocities in excess of
~ 350 km s$^{-1}$ are still sampled, but they have relatively low weight and are responsible for the high velocity tails of the profile.
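
The trend described above can be reproduced qualitatively with a much simpler calculation than the one used for Fig. C1. The sketch below is not the model of this appendix: it replaces the n = 4 Sersic mass distribution by a Hernquist sphere (an assumption, with scale length $a = r_{\rm mass}/1.8153$), ignores instrumental broadening, and measures the flux-weighted second moment of the line-of-sight velocity instead of fitting a Gaussian. All function names are hypothetical.

```python
import math

G = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun


def v_circ(r_kpc, mass_msun=1.0e11, r_mass_kpc=1.0):
    """Circular velocity of a Hernquist sphere, used here as a stand-in
    for the n = 4 Sersic mass model (assumption, not the paper's model)."""
    a = r_mass_kpc / 1.8153  # Hernquist scale length for this effective radius
    return math.sqrt(G * mass_msun * r_kpc) / (r_kpc + a)


def integrated_dispersion(r_ha_kpc, incl_deg=60.0, n_steps=20000):
    """Flux-weighted rms line-of-sight velocity of a thin exponential disk
    with half-light radius r_ha_kpc, rotating in the potential above."""
    h = r_ha_kpc / 1.678          # exponential scale length
    dr = 20.0 * h / n_steps       # integrate out to 20 scale lengths
    flux = 0.0
    v2 = 0.0
    for k in range(n_steps):
        r = (k + 0.5) * dr
        w = math.exp(-r / h) * r * dr   # flux in an annulus (up to a constant)
        flux += w
        v2 += w * v_circ(r) ** 2
    sin_i = math.sin(math.radians(incl_deg))
    # The azimuthal average of cos^2(phi) contributes the factor 1/2.
    return sin_i * math.sqrt(0.5 * v2 / flux)


def alpha_model(r_ha_kpc, incl_deg=60.0):
    """Ratio sigma / (V_rot sin i) at r = r_Halpha, as in Eq. C1."""
    sin_i = math.sin(math.radians(incl_deg))
    return integrated_dispersion(r_ha_kpc, incl_deg) / (v_circ(r_ha_kpc) * sin_i)


if __name__ == "__main__":
    for r_ha in (0.5, 1.0, 2.0, 4.0):
        sigma = integrated_dispersion(r_ha)
        print(f"r_Ha = {r_ha:3.1f} kpc: sigma = {sigma:5.0f} km/s, "
              f"alpha = {alpha_model(r_ha):.2f}")
```

Even under these simplifications the integrated dispersion decreases as the Hα disk becomes more extended. The values of α differ in detail from those derived in this appendix because a second moment is not the same quantity as the width of a fitted Gaussian.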

Relation Between Global Dispersion and Rotation Velocity at $r = r_{{\rm H}\alpha}$

We fitted Gaussian models to the line profiles, just as we do in the data analysis described in the main text. These Gaussian
fits are shown by the red curves in panels c-f of Fig. C1. The width of these Gaussians decreases with increasing $r_{{\rm H}\alpha}/r_{\rm mass}$, as
discussed above. We note here that the actual profile shape is not very well approximated by a Gaussian, particularly in panels c and
d. Interestingly, we see hints of double-horned profiles in the data for some of the galaxies (e.g., UDS_16442 and, particularly,
GOODS-N_774, which was published in Nelson et al. 2014), although the S/N ratio is not high enough to quantify this.

In Fig. C2a these measured galaxy-integrated velocity dispersions are plotted versus the half-light radii of the Hα disks, after
correcting for inclination and instrumental broadening (open squares). All ten galaxy models are shown, with Hα effective radii
ranging from $0.5 \times r_{\rm mass}$ to $5 \times r_{\rm mass}$. For comparison, the black curve shows the actual rotation curve of the galaxies. The squares
show the same fall-off as the actual rotation curve, with a roughly constant multiplicative offset. The solid squares are obtained
by dividing the measured dispersions by 0.8, which is the value of $\alpha = \sigma/V_{\rm rot}$ that we used in the analysis of Sect. 6.4. They are
in almost perfect agreement with the black curve, demonstrating that it is possible to reconstruct the average rotation curve of
sCMGs with our method.

Figure C2. a) Rotation curve of the model in Fig. C1 (black line), compared to the inclination-corrected, galaxy-integrated velocity dispersion
$\sigma$ for 10 different Hα distributions (open red squares). The half-light radius of the Hα emission ranges from $0.5 \times r_{\rm mass}$ to $5 \times r_{\rm mass}$. Solid red
squares are corrected for the parameter $\alpha = \sigma/V_{\rm rot} = 0.8$. b) Derived values of $\alpha$ from our model (black lines). The value $\alpha = 0.8 \pm 0.2$ that is
used in the main text is shown by the orange line. Different line types indicate results for different Sersic indices n of the mass distribution; the
value of $\alpha$ is nearly independent of n.

The analysis is generalized in Fig. C2b, where we show the value of $\alpha$ as a function of the ratio of the effective radius of Hα
and the effective radius of the mass. We repeated the analysis for different assumed mass profiles, ranging from exponential
(n = 1; dotted) to an $r^{1/4}$ law (n = 4; solid). The ratio between dispersion and rotation velocity at $r = r_{{\rm H}\alpha}$ is remarkably
constant; it does not vary appreciably either with $r_{{\rm H}\alpha}/r_{\rm mass}$ or with n. We conclude that the assumed value of $\alpha = 0.8 \pm 0.2$ is reasonable
for the mass and Hα profiles discussed in this paper.


